AN OLEOSIN 5' REGULATORY REGION FOR THE
MODIFICATION OF PLANT SEED LIPID COMPOSITION 

BACKGROUND OF THE INVENTION 

Seed oil content has traditionally been 
modified by plant breeding. The use of recombinant 
DNA technology to alter seed oil composition can 
accelerate this process and in some cases alter seed 
oils in a way that cannot be accomplished by breeding 
alone. The oil composition of Brassica has been 
significantly altered by modifying the expression of a 
number of lipid metabolism genes. Such manipulations 
of seed oil composition have focused on altering the 
proportion of endogenous component fatty acids. For 
example, antisense repression of the Δ12-desaturase
gene in transgenic rapeseed has resulted in an
increase in oleic acid of up to 83%. Topfer et al.
1995 Science 268:681-686.

There have been some successful attempts at 
modifying the composition of seed oil in transgenic 
plants by introducing new genes that allow the 
production of a fatty acid that the host plants were 
not previously capable of synthesizing. Van de Loo
et al. (1995 Proc. Natl. Acad. Sci. USA 92:6743-6747)
have been able to introduce a Δ12-hydroxylase gene
into transgenic tobacco, resulting in the introduction 
of a novel fatty acid, ricinoleic acid, into its seed 
oil. The reported accumulation was modest from plants 
carrying constructs in which transcription of the 
hydroxylase gene was under the control of the 
cauliflower mosaic virus (CaMV) 35S promoter. 
Similarly, tobacco plants have been engineered to
produce low levels of petroselinic acid by expression 
of an acyl-ACP desaturase from coriander (Cahoon et 
al. 1992 Proc. Natl. Acad. Sci USA 89:11184-11188). 

The long chain fatty acids (C18 and larger)
have significant economic value both as nutritionally
and medically important foods and as industrial
commodities (Ohlrogge, J.B. 1994 Plant Physiol.
104:821-826). Linoleic (18:2 Δ9,12) and α-linolenic
acid (18:3 Δ9,12,15) are essential fatty acids found
in many seed oils. The levels of these fatty acids
have been manipulated in oil seed crops through
breeding and biotechnology (Ohlrogge et al. 1991
Biochim. Biophys. Acta 1082:1-26; Topfer et al. 1995
Science 268:681-686). Additionally, the production of 
novel fatty acids in seed oils can be of considerable 
use in both human health and industrial applications. 

Consumption of plant oils rich in γ-linolenic
acid (GLA) (18:3 Δ6,9,12) is thought to
alleviate hypercholesterolemia and other related 
clinical disorders which correlate with susceptibility 
to coronary heart disease (Brenner R.R. 1976 Adv. Exp. 
Med. Biol. 83:85-101). The therapeutic benefits of 
dietary GLA may result from its role as a precursor to 
prostaglandin synthesis (Weete, J.D. 1980 in Lipid
Biochemistry of Fungi and Other Organisms, eds. Plenum
Press, New York, pp. 59-62). Linoleic acid (18:2) (LA)
is transformed into γ-linolenic acid (18:3) (GLA)
by the enzyme Δ6-desaturase.

Few seed oils contain GLA despite high
contents of the precursor linoleic acid. This is due 
to the absence of Δ6-desaturase activity in most
plants. For example, only borage (Borago
officinalis), evening primrose (Oenothera biennis),
and currants (Ribes nigrum) produce appreciable
amounts of γ-linolenic acid. Of these three species,
only Oenothera and borage are cultivated as a
commercial source for GLA. It would be beneficial if
agronomic seed oils could be engineered to produce GLA
in significant quantities by introducing a
heterologous Δ6-desaturase gene. It would also be
beneficial if other expression products associated 
with fatty acid synthesis and lipid metabolism could 
be produced in plants at high enough levels so that 
commercial production of a particular expression 
product becomes feasible. 
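
The shorthand used above (for example 18:2 Δ9,12 for linoleic acid) lends itself to a small illustration of what a Δ6-desaturase does to its substrate. The following is a minimal Python sketch and not part of the disclosure. The class `FattyAcid` and the function `desaturate` are hypothetical names introduced only for illustration, and the model tracks double-bond positions only, ignoring all enzymology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    """Fatty acid described by chain length and Δ double-bond positions."""
    carbons: int
    double_bonds: tuple  # positions counted from the carboxyl end

    def shorthand(self) -> str:
        # e.g. 18 carbons with bonds at 9 and 12 -> "18:2 Δ9,12"
        if not self.double_bonds:
            return f"{self.carbons}:0"
        positions = ",".join(str(p) for p in self.double_bonds)
        return f"{self.carbons}:{len(self.double_bonds)} Δ{positions}"


def desaturate(fatty_acid: FattyAcid, position: int) -> FattyAcid:
    """Return a new fatty acid with one extra double bond at `position`."""
    if not 1 <= position < fatty_acid.carbons:
        raise ValueError("position lies outside the carbon chain")
    if position in fatty_acid.double_bonds:
        raise ValueError("a double bond already exists at this position")
    bonds = tuple(sorted(fatty_acid.double_bonds + (position,)))
    return FattyAcid(fatty_acid.carbons, bonds)


linoleic = FattyAcid(18, (9, 12))
gla = desaturate(linoleic, 6)  # the Δ6-desaturation step
print(linoleic.shorthand(), "->", gla.shorthand())

alpha_linolenic = FattyAcid(18, (9, 12, 15))
print(alpha_linolenic.shorthand(), "->", desaturate(alpha_linolenic, 6).shorthand())
```

In this toy model the same Δ6 step applied to α-linolenic acid yields an 18:4 species, which corresponds to octadecatetraenoic acid.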

As disclosed in U.S. Patent No. 5,552,306, a
cyanobacterial Δ6-desaturase gene has been recently
isolated. Expression of this cyanobacterial gene in
transgenic tobacco resulted in significant but low
level GLA accumulation (Reddy et al. 1996 Nature
Biotech. 14:639-642). Applicant's copending U.S.
Application Serial No. 08/366,779 discloses a Δ6-
desaturase gene isolated from the plant Borago
officinalis and its expression in tobacco under the
control of the CaMV 35S promoter. Such expression
resulted in significant but low level GLA and
octadecatetraenoic acid (ODTA or OTA) accumulation in
seeds. Thus, a need exists for a promoter which
functions in plants and which consistently directs
high level expression of lipid metabolism genes in 
transgenic plant seeds. 

Oleosins are abundant seed proteins 
associated with the phospholipid monolayer membrane of 
oil bodies. The first oleosin gene, L3, was cloned
from maize by selecting clones whose in vitro
translated products were recognized by an anti-L3
antibody (Vance et al. 1987 J. Biol. Chem. 262:11275-
11279). Subsequently, different isoforms of oleosin
genes from such different species as Brassica,
soybean, carrot, pine, and Arabidopsis have been
cloned (Huang, A.H.C., 1992, Ann. Reviews Plant Phys.
and Plant Mol. Biol. 43:171-200; Kirik et al., 1996
Plant Mol. Biol. 31:413-417; Van Rooijen et al., 1992
Plant Mol. Biol. 18:1177-1179; Zou et al., Plant Mol.
Biol. 31:429-433). Oleosin protein sequences predicted
from these genes are highly conserved, especially for
the central hydrophobic domain. All of these oleosins
have the characteristic feature of three distinctive
domains. An amphipathic domain of 40-60 amino acids
is present at the N-terminus; a totally hydrophobic
domain of 68-74 amino acids is located at the center;
and an amphipathic α-helical domain of 33-40 amino
acids is situated at the C-terminus (Huang, A.H.C.
1992).

The present invention provides 5' regulatory
sequences from an oleosin gene which direct high level
expression of lipid metabolism genes in transgenic
plants. In accordance with the present invention,
chimeric constructs comprising an oleosin 5'
regulatory region operably linked to coding sequence
for a lipid metabolism gene such as a Δ6-desaturase
gene are provided. Transgenic plants comprising the
subject chimeric constructs produce levels of GLA
approaching the level found in those few plant species
which naturally produce GLA such as evening primrose
(Oenothera biennis).

SUMMARY OF THE INVENTION 

The present invention is directed to 5' 
regulatory regions of an Arabidopsis oleosin gene. 
The 5' regulatory regions, when operably linked to 
either the coding sequence of a heterologous gene or 
sequence complementary to a native plant gene, direct 
expression of the heterologous gene or complementary 
sequence in a plant seed. 

The present invention thus provides 
expression cassettes and expression vectors comprising 
an oleosin 5' regulatory region operably linked to a 
heterologous gene or a sequence complementary to a 
native plant gene.

Plant transformation vectors comprising the 
expression cassettes and expression vectors are also 
provided as are plant cells transformed by these 
vectors, and plants and their progeny containing the 
vectors.

In one embodiment of the invention, the 
heterologous gene or complementary gene sequence is a 
fatty acid synthesis gene or a lipid metabolism gene.

In another aspect of the present invention,
a method is provided for producing a plant with 
increased levels of a product of a fatty acid 
synthesis or lipid metabolism gene. 

In particular, there is provided a method
for producing a plant with increased levels of a
product of a fatty acid synthesis or lipid metabolism
gene by
transforming a plant with the subject expression 
cassettes and expression vectors which comprise an 
oleosin 5' regulatory region and a coding sequence for 
a fatty acid synthesis or lipid metabolism gene. 

In another aspect of the present invention, 
there is provided a method for cosuppressing a native 
fatty acid synthesis or lipid metabolism gene by 
transforming a plant with the subject expression 
cassettes and expression vectors which comprise an 
oleosin 5' regulatory region and a coding sequence for 
a fatty acid synthesis or lipid metabolism gene. 

A further aspect of this invention provides 
a method of decreasing production of a native plant 
gene such as a fatty acid synthesis gene or a lipid 
metabolism gene by transforming a plant with an 
expression vector comprising an oleosin 5' regulatory
region operably linked to a nucleic acid sequence 
complementary to a native plant gene. 

Also provided are methods of modulating the 
levels of a heterologous gene such as a fatty acid 
synthesis or lipid metabolism gene by transforming a 
plant with the subject expression cassettes and 
expression vectors.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 depicts the nucleotide and
corresponding amino acid sequence of the borage Δ6-
desaturase gene (SEQ ID NO:1). The cytochrome b5
heme-binding motif is boxed and the putative metal 
binding, histidine rich motifs (HRMs) are underlined. 
The motifs recognized by the primers (PCR analysis) 
are underlined with dotted lines, i.e. tgg aaa tgg aac 
cat aa; and gag cat cat ttg ttt cc. 

Fig. 2 is a dendrogram showing similarity of
the borage Δ6-desaturase to other membrane-bound
desaturases. The amino acid sequence of the borage Δ6-
desaturase was compared to other known desaturases
using GeneWorks (IntelliGenetics). Numerical values
correlate to relative phylogenetic distances between
subgroups compared.

Fig. 3A provides a gas liquid chromatography 
profile of the fatty acid methyl esters (FAMES)
derived from leaf tissue of a wild type tobacco
'Xanthi'.

Fig. 3B provides a gas liquid chromatography
profile of the FAMES derived from leaf tissue of a
tobacco plant transformed with the borage Δ6-
desaturase cDNA under transcriptional control of the
CaMV 35S promoter (pAN2). Peaks corresponding to
methyl linoleate (18:2), methyl γ-linolenate (18:3γ),
methyl α-linolenate (18:3α), and methyl
octadecatetraenoate (18:4) are indicated.

Fig. 4 is the nucleotide sequence and
corresponding amino acid sequence of the oleosin AtS21
cDNA (SEQ ID NO:3).

Fig. 5 is an acidic-base map of the 
predicted AtS21 protein generated by DNA Strider 1.2. 

Fig. 6 is a Kyte-Doolittle plot of the 
predicted AtS21 protein generated by DNA Strider 1.2. 

Fig. 7 is a sequence alignment of oleosins
isolated from Arabidopsis. Oleosin sequences
published or deposited in EMBL, BCM, NCBI databases
were aligned to each other using GeneWorks® 2.3.
Identical residues are boxed with rectangles. The
seven sequences fall into three groups. The first
group includes AtS21 (SEQ ID NO:5), X91918 (SEQ ID
NO:6) and Z29859 (SEQ ID NO:7). The second group
includes X62352 (SEQ ID NO:8) and Atol3 (SEQ ID NO:9).
The third group includes X91956 (SEQ ID NO:10) and
L40954 (SEQ ID NO:11). Differences in amino acid
residues within the same group are indicated by
shadows. Atol2/Z54164 is identical to AtS21. The Atol3
sequence (Accession No. Z54165 in EMBL database) is
actually not disclosed in the EMBL database. The
Z54165 Accession number designates the same sequence
as Z54164 which is Atol2.

Fig. 8A is a Northern analysis of the AtS21
gene. An RNA gel blot containing ten micrograms of
total RNA extracted from Arabidopsis flowers (F),
leaves (L), roots (R), developing seeds (Se), and
developing silique coats (Si) was hybridized with a
probe made from the full-length AtS21 cDNA.

Fig. 8B is a Southern analysis of the AtS21
gene. A DNA gel blot containing ten micrograms of
genomic DNA digested with BamHI (B), EcoRI (E),
HindIII (H), SacI (S), and XbaI (X) was hybridized
with a probe made from the full-length AtS21 cDNA.

Fig. 9 is the nucleotide sequence of the 
SacI fragment of AtS21 genomic DNA (SEQ ID NO:12).
The promoter and intron sequences are in uppercase. 
The fragments corresponding to AtS21 cDNA sequence are 
in lower case. The first ATG codon and a putative 
TATA box are shadowed. The sequence complementary to 
21P primer for PCR amplification is boxed. A putative 
abscisic acid response element (ABRE) and two 14 bp 
repeats are underlined. 

Fig. 10 is a map of AtS21 promoter/GUS 
construct (pAN5).

Fig. 11A depicts AtS21/GUS gene expression 
in Arabidopsis bolt and leaves. 

Fig. 11B depicts AtS21/GUS gene expression
in Arabidopsis siliques.

Fig. 11C depicts AtS21/GUS gene expression
in Arabidopsis developing seeds.

Figs. 11D through 11J depict AtS21/GUS gene
expression in Arabidopsis developing embryos.

Fig. 11K depicts AtS21/GUS gene expression
in Arabidopsis root and root hairs of a young
seedling.

Fig. 11L depicts AtS21/GUS gene expression
in Arabidopsis cotyledons and the shoot apex of a five
day seedling.

Figs. 11M and 11N depict AtS21/GUS gene
expression in Arabidopsis cotyledons and the shoot
apex of 5-15 day seedlings.

Fig. 12A depicts AtS21/GUS gene expression 
in tobacco embryos and endosperm. 

Fig. 12B depicts AtS21/GUS gene expression 
in germinating tobacco seeds. 

Fig. 12C depicts AtS21/GUS gene expression 
in a 5 day old tobacco seedling. 

Fig. 12D depicts AtS21/GUS gene expression 
in 5-15 day old tobacco seedlings. 

Fig. 13A is a Northern analysis showing 
AtS21 mRNA levels in developing wild-type Arabidopsis
seedlings. Lane 1 was loaded with RNA from developing
seeds, lane 2 was loaded with RNA from seeds imbibed
for 24-48 hours, lane 3: 3 day seedlings; lane 4: 4
day seedlings; lane 5: 5 day seedlings; lane 6: 6 day
seedlings; lane 7: 9 day seedlings; lane 8: 12 day
seedlings. Probe was labeled AtS21 cDNA. Exposure 
was for one hour at -80°C. 

Fig. 13B is the same blot as Fig. 13A only 
exposure was for 24 hours at -80°C. 

Fig. 13C is the same blot depicted in Figs. 
13A and 13B after stripping and hybridization with an 
Arabidopsis tubulin gene probe. The small band in 
each of lanes 1 and 2 is the remnant of the previous 
AtS21 probe. Exposure was for 48 hours at -80°C. 

Fig. 14 is a graph comparing GUS activities 
expressed by the AtS21 and 35S promoters. GUS 
activities expressed by the AtS21 promoter in 
developing Arabidopsis seeds and leaf are plotted side
by side with those expressed by the 35S promoter. The
GUS activities expressed by the AtS21 promoter in 
tobacco dry seed and leaf are plotted on the right 
side of the figure. GUS activity in tobacco leaf is 
so low that no column appears. "G-H" denotes globular 
to heart stage; "H-T" denotes heart to torpedo stage; 
"T-C" denotes torpedo to cotyledon stage; "Early C" 
denotes early cotyledon; "Late C" denotes late 
cotyledon. The standard deviations are listed in 
Table 2.

Fig. 15A is an RNA gel blot analysis carried 
out on 5 μg samples of RNA isolated from borage leaf,
root, and 12 dpp embryo tissue, using labeled borage
Δ6-desaturase cDNA as a hybridization probe.

Fig. 15B depicts a graph corresponding to 
the Northern analysis results for the experiment shown 
in Fig. 15A. 

Fig. 16A is a graph showing relative legumin 
RNA accumulation in developing borage embryos based on 
results of Northern blot. 

Fig. 16B is a graph showing relative 
oleosin RNA accumulation in developing borage embryos 
based on results of Northern blot. 

Fig. 16C is a graph showing relative Δ6-
desaturase RNA accumulation in developing borage
embryos based on results of Northern blot. 

Fig. 17 is a PCR analysis showing the
presence of the borage delta 6-desaturase gene in
transformed plants of oilseed rape. Lanes 1, 3 and 4
were loaded with PCR reactions performed with DNA from
plants transformed with the borage delta 6-desaturase
gene linked to the oleosin 5' regulatory region; lane
2: DNA from plant transformed with the borage delta 6-
desaturase gene linked to the albumin 5' regulatory
region; lanes 5 and 6: DNA from non-transformed
plants; lane 7: molecular weight marker (1 kb ladder,
Gibco BRL); lane 8: PCR without added template DNA;
lane 9: control with DNA from Agrobacterium
tumefaciens EHA 105 containing the plasmid pAN3 (i.e.
the borage delta 6-desaturase gene linked to the
oleosin 5' regulatory region).

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides isolated 
nucleic acids encoding 5' regulatory regions from an
Arabidopsis oleosin gene. In accordance with the 
present invention, the subject 5' regulatory regions, 
when operably linked to either a coding sequence of a 
heterologous gene or a sequence complementary to a 
native plant gene, direct expression of the coding 
sequence or complementary sequence in a plant seed. 
The oleosin 5' regulatory regions of the present 
invention are useful in the construction of an 
expression cassette which comprises in the 5' to 3' 
direction, a subject oleosin 5' regulatory region, a 
heterologous gene or sequence complementary to a 
native plant gene under control of the regulatory 
region and a 3' termination sequence. Such an 
expression cassette can be incorporated into a variety 
of autonomously replicating vectors in order to
construct an expression vector. 

It has been surprisingly found that plants 
transformed with the expression vectors of the present 
invention produce levels of GLA approaching the level 
found in those few plant species which naturally 
produce GLA such as evening primrose (Oenothera
biennis).

As used herein, the term "cassette" refers 
to a nucleotide sequence capable of expressing a 
particular gene if said gene is inserted so as to be 
operably linked to one or more regulatory regions 
present in the nucleotide sequence. Thus, for 
example, the expression cassette may comprise a 
heterologous coding sequence which is desired to be 
expressed in a plant seed. The expression cassettes 
and expression vectors of the present invention are 
therefore useful for directing seed-specific
expression of any number of heterologous genes. The
term "seed-specific expression" as used herein, refers
to expression in various portions of a plant seed such 
as the endosperm and embryo. 

An isolated nucleic acid encoding a 5' 
regulatory region from an oleosin gene can be provided 
as follows. Oleosin recombinant genomic clones are 
isolated by screening a plant genomic DNA library with 
a cDNA (or a portion thereof) representing oleosin 
mRNA. A number of different oleosin cDNAs have been 
isolated. The methods used to isolate such cDNAs as 
well as the nucleotide and corresponding amino acid 
sequences have been published in Kirik et al. 1996
Plant Mol. Biol. 31:413-417; Zou et al. Plant Mol.
Biol. 31:429-433; Van Rooijen et al. 1992 Plant Mol.
Biol. 18:1177-1179.

Virtual subtraction screening of a tissue 
specific library using a random primed polymerase
chain reaction (RP-PCR) cDNA probe is another method of
obtaining an oleosin cDNA useful for screening a plant 
genomic DNA library. Virtual subtraction screening 
refers to a method where a cDNA library is constructed 
from a target tissue and displayed at a low density so 
that individual cDNA clones can be easily separated. 
These cDNA clones are subtractively screened with 
driver quantities (i.e., concentrations of DNA to 
kinetically drive the hybridization reaction) of cDNA 
probes made from tissue or tissues other than the 
target tissue (i.e. driver tissue). The hybridized 
plaques represent genes that are expressed in both the 
target and the driver tissues; the unhybridized 
plaques represent genes that may be target tissue-
specific or low-abundance genes that cannot be
detected by the driver cDNA probe. The unhybridized
cDNAs are selected as putative target tissue-specific
genes and further analyzed by one-pass sequencing and 
Northern hybridization. 
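
The selection step in virtual subtraction screening can be pictured as a simple filter over plaque signals. The following Python sketch is illustrative only and is not taken from the disclosure. The names `Plaque` and `select_unhybridized`, the `fold` cutoff, and all signal values are assumptions made for the example; in practice the call between hybridized and unhybridized plaques is made from the filter image.

```python
from dataclasses import dataclass


@dataclass
class Plaque:
    clone_id: str
    driver_signal: float  # signal after hybridization with the driver cDNA probe


def select_unhybridized(plaques, background, fold=2.0):
    """Return IDs of plaques whose driver signal stays near background.

    A plaque counts as unhybridized when its signal is at most
    `fold` times the background (an arbitrary, hypothetical cutoff).
    """
    threshold = fold * background
    return [p.clone_id for p in plaques if p.driver_signal <= threshold]


# Hypothetical low-density display of four cDNA clones.
plaques = [
    Plaque("c1", 950.0),  # expressed in target and driver tissue
    Plaque("c2", 12.0),   # putative target tissue-specific clone
    Plaque("c3", 18.0),   # putative target-specific or low-abundance clone
    Plaque("c4", 400.0),
]
candidates = select_unhybridized(plaques, background=10.0)
print(candidates)
```

The clones returned by such a filter would then go on to one-pass sequencing and Northern hybridization, as described above.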

Random primed PCR (RP-PCR) involves 
synthesis of large quantities of cDNA probes from a 
trace amount of cDNA template. The method combines 
the amplification power of PCR with the representation 
of random priming to simultaneously amplify and label
double-stranded cDNA in a single tube reaction.

Methods considered useful in obtaining 
oleosin genomic recombinant DNA are provided in 
Sambrook et al. 1989, in Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, NY, for 
example, or any of the myriad of laboratory manuals on 
recombinant DNA technology that are widely available. 
To determine nucleotide sequences, a multitude of 
techniques are available and known to the ordinarily 
skilled artisan. For example, restriction fragments 
containing an oleosin regulatory region can be 
subcloned into the polylinker site of a sequencing 
vector such as pBluescript (Stratagene). These
pBluescript subclones can then be sequenced by the
double-stranded dideoxy method (Chen and Seeburg,
1985, DNA 4:165).

In a preferred embodiment, the oleosin 
regulatory region comprises nucleotides 1-1267 of Fig. 
9 (SEQ ID NO:12). Modifications to the oleosin
regulatory region as set forth in SEQ ID NO:12 which
maintain the characteristic property of directing
seed-specific expression are within the scope of the
present invention. Such modifications include
insertions, deletions and substitutions of one or more
nucleotides.
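
Sequence coordinates such as "nucleotides 1-1267" are 1-based and inclusive, whereas most programming languages index from 0 with an exclusive end. The short Python sketch below shows one way to make that conversion explicit. It is illustrative only: `extract_region` is a hypothetical helper, and the input string is a placeholder, not the sequence of SEQ ID NO:12.

```python
def extract_region(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end using 1-based, inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    # Convert to Python's 0-based, end-exclusive slice.
    return sequence[start - 1:end]


# Placeholder standing in for a cloned genomic fragment (1600 nt);
# this is NOT the real SEQ ID NO:12.
placeholder_fragment = "ACGT" * 400

regulatory_region = extract_region(placeholder_fragment, 1, 1267)
print(len(regulatory_region))
```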

The 5' regulatory region of the present
invention can be derived from restriction endonuclease
or exonuclease digestion of an oleosin genomic clone.
Thus, for example, the known nucleotide or amino acid
sequence of the coding region of an isolated oleosin
gene (e.g. Fig. 7) is aligned to the nucleic acid or
deduced amino acid sequence of an isolated oleosin
genomic clone and 5' flanking sequence (i.e., sequence
upstream from the translational start codon of the
coding region) of the isolated oleosin genomic clone
is located.

The oleosin 5' regulatory region as set
forth in SEQ ID NO:12 (nucleotides 1-1267 of Fig. 9)
may be generated from a genomic clone having either or
both excess 5' flanking sequence or coding sequence by
exonuclease III-mediated deletion. This is
accomplished by digesting appropriately prepared DNA
with exonuclease III (exoIII) and removing aliquots at
increasing intervals of time during the digestion.
The resulting successively smaller fragments of DNA
may be sequenced to determine the exact endpoint of
the deletions. There are several commercially
available systems which use exonuclease III (exoIII) 
to create such a deletion series, e.g. Promega 
Biotech, "Erase-A-Base" system. Alternatively, PCR 
primers can be defined to allow direct amplification 
of the subject 5' regulatory regions. 
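
The time-course logic of a deletion series can be sketched in a few lines of Python. This is an idealized model, not a description of any commercial kit: it assumes the net effect of digestion is removal of a constant number of base pairs from the 5' end per sampling interval, and the function name, the rate, and the toy sequence are all hypothetical.

```python
def five_prime_deletion_series(sequence, bases_per_interval, n_intervals):
    """Model aliquots taken at increasing digestion times.

    Each successive aliquot has lost `bases_per_interval` more bases from
    its 5' end; the 3' end is left intact. Aliquot 0 is the undigested clone.
    """
    if bases_per_interval <= 0:
        raise ValueError("bases_per_interval must be positive")
    fragments = []
    for interval in range(n_intervals + 1):
        removed = interval * bases_per_interval
        if removed >= len(sequence):
            break  # nothing left to recover from later aliquots
        fragments.append(sequence[removed:])
    return fragments


# Toy clone: ten bases of excess 5' flank followed by a TATA-like motif.
toy_clone = "A" * 10 + "TATAAA"
series = five_prime_deletion_series(toy_clone, bases_per_interval=4, n_intervals=3)
for fragment in series:
    print(len(fragment), fragment)
```

In a real experiment the endpoints are not known in advance, which is why the fragments are sequenced to determine where each deletion actually stopped.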

Using the same methodologies, the 
ordinarily skilled artisan can generate one or more 
deletion fragments of nucleotides 1-1267 as set forth 
in SEQ ID NO:12. Any and all deletion fragments which
comprise a contiguous portion of nucleotides set forth
in SEQ ID NO:12 and which retain the capacity to
direct seed-specific expression are contemplated by
the present invention. 

The identification of oleosin 5' regulatory
sequences which direct seed-specific expression
comprising nucleotides 1-1267 of SEQ ID NO:12 and
modifications or deletion fragments thereof, can be 
accomplished by transcriptional fusions of specific 
sequences with the coding sequences of a heterologous 
gene, transfer of the chimeric gene into an 
appropriate host, and detection of the expression of 
the heterologous gene. The assay used to detect 
expression depends upon the nature of the heterologous 
sequence. For example, reporter genes, exemplified by 
chloramphenicol acetyl transferase and β-glucuronidase
(GUS), are commonly used to assess transcriptional and
translational competence of chimeric constructions.
Standard assays are available to sensitively detect
the reporter enzyme in a transgenic organism. The β-
glucuronidase (GUS) gene is useful as a reporter of
promoter activity in transgenic plants because of the
high stability of the enzyme in plant cells, the lack
of intrinsic β-glucuronidase activity in higher plants
and availability of a quantitative fluorimetric assay
and a histochemical localization technique. Jefferson
et al. (1987 EMBO J. 6:3901) have established standard
procedures for biochemical and histochemical detection
of GUS activity in plant tissues. Biochemical assays 
are performed by mixing plant tissue lysates with 4-
methylumbelliferyl-β-D-glucuronide, a fluorimetric
substrate for GUS, incubating one hour at 37°C, and
then measuring the fluorescence of the resulting 4-
methylumbelliferone.

WO 98/45461    PCT/US98/07179

Histochemical localization for
GUS activity is determined by incubating plant tissue 
samples in 5-bromo-4-chloro-3-indolyl-glucuronide (X-
Gluc) for about 18 hours at 37°C and observing the
staining pattern of X-Gluc. The construction of such
chimeric genes allows definition of specific
regulatory sequences and demonstrates that these
sequences can direct expression of heterologous genes
in a seed-specific manner.
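
A fluorimetric GUS reading is usually converted into a specific activity by way of a 4-methylumbelliferone standard curve. The Python sketch below shows the arithmetic under stated assumptions: a linear standard curve forced through the origin, a single end-point reading, and activity expressed as pmol product per minute per mg protein. The function names and every number are hypothetical and are not data from this disclosure.

```python
def fit_slope_through_origin(amounts_pmol, fluorescence):
    """Least-squares slope (fluorescence units per pmol) for y = slope * x."""
    numerator = sum(a * f for a, f in zip(amounts_pmol, fluorescence))
    denominator = sum(a * a for a in amounts_pmol)
    if denominator == 0:
        raise ValueError("standards must include a non-zero amount")
    return numerator / denominator


def gus_specific_activity(sample_fluorescence, blank_fluorescence,
                          slope, minutes, protein_mg):
    """Return pmol 4-methylumbelliferone per minute per mg protein."""
    product_pmol = (sample_fluorescence - blank_fluorescence) / slope
    return product_pmol / minutes / protein_mg


# Hypothetical standard curve: pmol of 4-methylumbelliferone vs. fluorescence.
standards_pmol = [0.0, 100.0, 200.0, 400.0]
standards_fluorescence = [0.0, 50.0, 100.0, 200.0]
slope = fit_slope_through_origin(standards_pmol, standards_fluorescence)

# Hypothetical lysate: 60 minute incubation at 37°C, 0.01 mg protein.
activity = gus_specific_activity(
    sample_fluorescence=260.0,
    blank_fluorescence=10.0,
    slope=slope,
    minutes=60.0,
    protein_mg=0.01,
)
print(round(activity, 1), "pmol/min/mg")
```

Normalizing to protein content is what allows activities from different tissues or developmental stages to be placed side by side.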

Another aspect of the invention is directed 
to expression cassettes and expression vectors (also 
termed herein "chimeric genes") comprising a 5' 
regulatory region from an oleosin gene which directs 
seed-specific expression operably linked to the coding
sequence of a heterologous gene such that the 
regulatory element is capable of controlling 
expression of the product encoded by the heterologous 
gene. The heterologous gene can be any gene other 
than oleosin. If necessary, additional regulatory 
elements or parts of these elements sufficient to 
cause expression resulting in production of an 
effective amount of the polypeptide encoded by the 
heterologous gene are included in the chimeric 
constructs.

Accordingly, the present invention provides 
chimeric genes comprising sequences of the oleosin 5' 
regulatory region that confer seed-specific expression
which are operably linked to a sequence encoding a
heterologous gene such as a lipid metabolism enzyme.
Examples of lipid metabolism genes useful for
practicing the present invention include lipid
desaturases such as Δ6-desaturases, Δ12-desaturases,
Δ15-desaturases and other related desaturases such as
stearoyl-ACP desaturases, acyl carrier proteins
(ACPs), thioesterases, acetyl transacylases, acetyl-
CoA carboxylases, ketoacyl-synthases, malonyl
transacylases, and elongases. Such lipid metabolism
genes have been isolated and characterized from a 
number of different bacteria and plant species. Their 
nucleotide coding sequences as well as methods of 
isolating such coding sequences are disclosed in the 
published literature and are widely available to those 
of skill in the art. 

In particular, the Δ6-desaturase genes
disclosed in U.S. Patent No. 5,552,306 and 
applicants' copending U.S. Application Serial No. 
08/366,779 filed December 30, 1994 and incorporated 
herein by reference, are contemplated as lipid 
metabolism genes particularly useful in the practice 
of the present invention. 

The chimeric genes of the present invention 
are constructed by ligating a 5' regulatory region of
an oleosin genomic DNA to the coding sequence of a
heterologous gene. The juxtaposition of these
sequences can be accomplished in a variety of ways.
In a preferred embodiment the order of the sequences,
from 5' to 3', is an oleosin 5' regulatory region
(including a promoter), a coding sequence, and a
termination sequence which includes a polyadenylation
site.
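
The 5' to 3' ordering described above can be captured in a small data structure. The following Python sketch is only an in silico illustration. `Cassette` and `build_cassette` are hypothetical names, the sequences are short placeholders rather than real promoter, gene, or terminator sequences, and the start and stop codon checks are an added sanity test, not a requirement stated in the text.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass(frozen=True)
class Cassette:
    regulatory_region: str  # 5' regulatory region, including the promoter
    coding_sequence: str    # heterologous coding sequence
    terminator: str         # termination sequence with polyadenylation site

    def sequence(self) -> str:
        # Elements are joined in the 5' to 3' order given in the text.
        return self.regulatory_region + self.coding_sequence + self.terminator


def build_cassette(regulatory_region, coding_sequence, terminator):
    """Assemble a cassette after basic checks on the coding sequence."""
    cds = coding_sequence.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence should begin with an ATG codon")
    if len(cds) % 3 != 0 or cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence should end with an in-frame stop codon")
    return Cassette(regulatory_region.upper(), cds, terminator.upper())


# Placeholder sequences for illustration only.
cassette = build_cassette("ttgacaTATAAAtcc", "ATGGCTTAA", "aataaa")
print(cassette.sequence())
```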

Standard techniques for construction of such 
chimeric genes are well known to those of ordinary 
skill in the art and can be found in references such 
as Sambrook et al. (1989). A variety of strategies are
available for ligating fragments of DNA, the choice of 
which depends on the nature of the termini of the DNA 
fragments. One of ordinary skill in the art 
recognizes that in order for the heterologous gene to 
be expressed, the construction requires promoter 
elements and signals for efficient polyadenylation of 
the transcript. Accordingly, the oleosin 5' 
regulatory region that contains the consensus promoter 
sequence known as the TATA box can be ligated directly 
to a promoterless heterologous coding sequence. 

The restriction or deletion fragments that 
contain the oleosin TATA box are ligated in a forward 
orientation to a promoterless heterologous gene such 
as the coding sequence of β-glucuronidase (GUS). The
skilled artisan will recognize that the subject
oleosin 5' regulatory regions can be provided by other
means, for example chemical or enzymatic synthesis.
The 3' end of a heterologous coding sequence is
optionally ligated to a termination sequence 
comprising a polyadenylation site, exemplified by, but 
not limited to, the nopaline synthase polyadenylation 
site, or the octopine T-DNA gene 7 polyadenylation 
site. Alternatively, the polyadenylation site can be 
provided by the heterologous gene.

The present invention also provides methods
of increasing levels of heterologous genes in plant 
seeds. In accordance with such methods, the subject 
expression cassettes and expression vectors are 
introduced into a plant in order to effect expression 
of a heterologous gene. For example, a method of 
producing a plant with increased levels of a product 
of a fatty acid synthesis or lipid metabolism gene is 
provided by transforming a plant cell with an 
expression vector comprising an oleosin 5' regulatory 
region operably linked to a fatty acid synthesis or 
lipid metabolism gene and regenerating a plant with 
increased levels of the product of said fatty acid 
synthesis or lipid metabolism gene. 

Another aspect of the present invention 
provides methods of reducing levels of a product of a 
gene which is native to a plant which comprises 
transforming a plant cell with an expression vector 
comprising a subject oleosin regulatory region 
operably linked to a nucleic acid sequence which is 
complementary to the native plant gene. In this 
manner, levels of endogenous product of the native 
plant gene are reduced through the mechanism known as 
antisense regulation. Thus, for example, levels of a 
product of a fatty acid synthesis gene or lipid 
metabolism gene are reduced by transforming a plant 
with an expression vector comprising a subject oleosin 
5' regulatory region operably linked to a nucleic acid
sequence which is complementary to a nucleic acid
sequence coding for a native fatty acid synthesis or
lipid metabolism gene. 
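
A sequence "complementary to" a native gene, read 5' to 3', is the reverse complement of the sense strand. The Python sketch below shows that operation on a placeholder sequence. `reverse_complement` is a hypothetical helper written for this illustration, and it handles only the four unambiguous DNA bases.

```python
# Translation table for the four unambiguous DNA bases (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


# Placeholder sense-strand fragment; not a real gene sequence.
sense = "ATGGCTTAA"
antisense = reverse_complement(sense)
print(antisense)
```

Placing such a sequence downstream of the regulatory region yields a transcript that can pair with the native mRNA, which is the premise of the antisense approach described above.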

The present invention also provides a method 
of cosuppressing a gene which is native to a plant 
which comprises transforming a plant cell with an 
expression vector comprising a subject oleosin 5' 
regulatory region operably linked to a nucleic acid 
sequence coding for the native plant gene. In this 
manner, levels of endogenous product of the native 
plant gene are reduced through the mechanism known as 
cosuppression. Thus, for example, levels of a product
of a fatty acid synthesis gene or lipid metabolism
gene are reduced by transforming a plant with an
expression vector comprising a subject oleosin 5'
regulatory region operably linked to a nucleic acid
sequence coding for a fatty acid synthesis or
lipid metabolism gene native to the plant. Although
the exact mechanism of cosuppression is not completely
understood, one skilled in the art is familiar with
published works reporting the experimental conditions
and results associated with cosuppression (Napoli et
al. 1990 The Plant Cell 2:270-289; Van der Krol 1990
The Plant Cell 2:291-299).

To provide regulated expression of the 
heterologous or native genes, plants are transformed 
with the chimeric gene constructions of the invention. 
Methods of gene transfer are well known in the art. 
The chimeric genes can be introduced into plants by
the leaf disk transformation-regeneration procedure
described by Horsch et al. 1985 Science 227:1229.
Other methods of transformation such as protoplast
culture (Horsch et al. 1984 Science 223:496; DeBlock
et al. 1984 EMBO J. 2:2143; Barton et al. 1983 Cell
32:1033) can also be used and are within the scope of
this invention. In a preferred embodiment, plants are
transformed with Agrobacterium-derived vectors such as
those described in Klett et al. (1987) Annu. Rev.
Plant Physiol. 38:467. Other well-known methods are
available to insert the chimeric genes of the present
invention into plant cells. Such alternative methods
include biolistic approaches (Klein et al. 1987 Nature
327:70), electroporation, chemically-induced DNA
uptake, and use of viruses or pollen as vectors.

When necessary for the transformation 
method, the chimeric genes of the present invention 
can be inserted into a plant transformation vector, 
e.g. the binary vector described by Bevan, M. 1984 
Nucleic Acids Res. 12:8711-8721. Plant transformation 
vectors can be derived by modifying the natural gene 
transfer system of Agrobacterium tumefaciens. The 
natural system comprises large Ti (tumor-inducing)
plasmids containing a large segment, known as T-DNA,
which is transferred to transformed plants. Another
segment of the Ti plasmid, the vir region, is
responsible for T-DNA transfer. The T-DNA region is
bordered by terminal repeats. In the modified binary
vectors, the tumor-inducing genes have been deleted
and the functions of the vir region are utilized to
transfer foreign DNA bordered by the T-DNA border
sequences. The T-region also contains a selectable
marker for antibiotic resistance, and a multiple
cloning site for inserting sequences for transfer.
Such engineered strains are known as "disarmed" A.
tumefaciens strains, and allow the efficient transfer
of sequences bordered by the T-region into the nuclear
genome of plants.

Surface-sterilized leaf disks and other
susceptible tissues are inoculated with the "disarmed"
foreign DNA-containing A. tumefaciens, cultured for a
number of days, and then transferred to
antibiotic-containing medium. Transformed shoots are then
selected after rooting in medium containing the 
appropriate antibiotic, and transferred to soil. 
Transgenic plants are pollinated and seeds from these 
plants are collected and grown on antibiotic medium. 

Expression of a heterologous or reporter 
gene in developing seeds, young seedlings and mature 
plants can be monitored by immunological, 
histochemical or activity assays. As discussed 
herein, the choice of an assay for expression of the 
chimeric gene depends upon the nature of the 
heterologous coding region. For example, Northern 
analysis can be used to assess transcription if 
appropriate nucleotide probes are available. If 
antibodies to the polypeptide encoded by the 
heterologous gene are available, Western analysis and 
immunohistochemical localization can be used to assess 
the production and localization of the polypeptide. 
Depending upon the heterologous gene, appropriate 
biochemical assays can be used. For example,
acetyltransferases are detected by measuring
acetylation of a standard substrate. The expression
of a lipid desaturase gene can be assayed by analysis
of fatty acid methyl esters (FAMEs).

Another aspect of the present invention 
provides transgenic plants or progeny of these plants 
containing the chimeric genes of the invention. Both 
monocotyledonous and dicotyledonous plants are 
contemplated. Plant cells are transformed with the 
chimeric genes by any of the plant transformation 
methods described above. The transformed plant cell, 
usually in the form of a callus culture, leaf disk, 
explant or whole plant (via the vacuum infiltration 
method of Bechtold et al. 1993 C.R. Acad. Sci. Paris,
316:1194-1199) is regenerated into a complete
transgenic plant by methods well-known to one of
ordinary skill in the art (e.g. Horsch et al. 1985
Science 227:1229). In a preferred embodiment, the
transgenic plant is sunflower, cotton, oil seed rape, 
maize, tobacco, Arabidopsis, peanut or soybean. Since 
progeny of transformed plants inherit the chimeric 
genes, seeds or cuttings from transformed plants are 
used to maintain the transgenic line. 

The following examples further illustrate
the invention.

EXAMPLE 1

Isolation of Membrane-Bound Polysomal
RNA and Construction of Borage cDNA Library

Membrane-bound polysomes were isolated from
borage seeds 12 days post pollination (12 DPP) using
the protocol established for peas by Larkins and
Davies (1975 Plant Phys. 55:749-756). RNA was
extracted from the polysomes as described by Mechler
(1987 Methods in Enzymology 152:241-248, Academic
Press). Poly-A+ RNA was isolated from the
membrane-bound polysomal RNA using Oligotex-dT™ beads
(Qiagen).

Corresponding cDNA was made using
Stratagene's ZAP cDNA synthesis kit. The cDNA library
was constructed in the lambda ZAP II vector
(Stratagene) using the lambda ZAP II kit. The primary
library was packaged with Gigapack II Gold packaging
extract (Stratagene).

EXAMPLE 2

Isolation of a Δ6-Desaturase cDNA from Borage

Hybridization protocol

The amplified borage cDNA library was plated
at low density (500 pfu on 150 mm petri dishes).
Highly prevalent seed storage protein cDNAs were
reduced (subtracted from the total cDNAs) by screening
with the corresponding cDNAs.

Hybridization probes for screening the
borage cDNA library were generated by using random
primed DNA synthesis as described by Ausubel et al.
(1994 Current Protocols in Molecular Biology, Wiley
Interscience, N.Y.) and corresponded to previously
identified abundantly expressed seed storage protein
cDNAs. Unincorporated nucleotides were removed by use
of a G-50 spin column (Boehringer Mannheim). Probe was
denatured for hybridization by boiling in a water bath
for 5 minutes, then quickly cooled on ice.
Nitrocellulose filters carrying fixed recombinant
bacteriophage were prehybridized at 60°C for 2-4 hours
in hybridization solution [4X SET (600 mM NaCl, 80 mM
Tris-HCl, 4 mM Na2EDTA; pH 7.8), 5X Denhardt's reagent
(0.1% bovine serum albumin, 0.1% Ficoll, and 0.1%
polyvinylpyrrolidone), 100 μg/ml denatured salmon sperm
DNA, 50 μg/ml polyadenine and 10 μg/ml polycytidine].
This was replaced with fresh hybridization solution to
which denatured radioactive probe (2 ng/ml
hybridization solution) was added. The filters were
incubated at 60°C with agitation overnight. Filters
were washed sequentially in 4X, 2X, and 1X SET (150 mM
NaCl, 20 mM Tris-HCl, 1 mM Na2EDTA; pH 7.8) for 15
minutes each at 60°C. Filters were air dried and then
exposed to X-ray film for 24 hours with intensifying
screens at -80°C.
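
The relationship between the 4X, 2X, and 1X SET wash solutions can be illustrated by a short calculation. The following is a minimal sketch that takes the 1X SET composition stated above as its only input; the function names and the 20X stock used in the usage line are illustrative assumptions and are not part of the described protocol.

```python
# Composition of 1X SET as stated in the text, in mM.
SET_1X_MM = {"NaCl": 150.0, "Tris-HCl": 20.0, "Na2EDTA": 1.0}


def set_buffer(strength):
    """Return component concentrations (mM) of SET at the given X strength."""
    return {name: conc * strength for name, conc in SET_1X_MM.items()}


def stock_volume_ml(stock_x, target_x, final_volume_ml):
    """Volume of stock needed for final_volume_ml at target_x (C1*V1 = C2*V2)."""
    if target_x > stock_x:
        raise ValueError("target strength cannot exceed stock strength")
    return final_volume_ml * target_x / stock_x


# The three sequential washes described in the text.
for strength in (4, 2, 1):
    print(f"{strength}X SET:", set_buffer(strength))

# Hypothetical usage: volume of an assumed 20X stock for 500 ml of 4X SET.
print(stock_volume_ml(20, 4, 500), "ml of stock")
```

The 4X values computed this way agree with the 600 mM NaCl, 80 mM Tris-HCl, and 4 mM Na2EDTA given for the hybridization solution.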

Non-hybridizing plaques were excised using 
Stratagene's excision protocol and reagents. 
Resulting bacterial colonies were used to inoculate 
liquid cultures and were either sequenced manually or 
by an ABI automated sequencer. 

Random Sequencing of cDNAs from a Borage Seed (12 DPP)
Membrane-Bound Polysomal Library

Each cDNA corresponding to a non-hybridizing plaque
was sequenced once and a sequence tag generated from
200-300 base pairs. All sequencing was performed by
cycle sequencing (Epicentre). Over 300 expressed
sequence tags (ESTs) were generated. Each sequence tag
was compared to the GenBank database using the BLAST
algorithm (Altschul et al. 1990 J. Mol. Biol.
215:403-410). A number of lipid metabolism genes,
including the Δ6-desaturase, were identified.

Database searches with the cDNA clone designated
mbp-65 using BLASTX with the GenBank database resulted
in a significant match to the previously isolated
Synechocystis Δ6-desaturase. It was determined,
however, that mbp-65 was not a full length cDNA. A
full length cDNA was isolated using mbp-65 to screen
the borage membrane-bound polysomal library. The
resultant clone was designated pAN1 and the cDNA
insert of pAN1 was sequenced by the cycle sequencing
method. The amino acid sequence deduced from the open
reading frame (Fig. 1, SEQ ID NO:1) was compared to
other known desaturases using the Geneworks
(IntelliGenetics) protein alignment program. This
alignment indicated that the cDNA insert of pAN1 was
the borage Δ6-desaturase gene.

The resulting dendrogram (Figure 2) shows
that Δ15-desaturases and Δ12-desaturases comprise two
groups. The newly isolated borage sequence and the
previously isolated Synechocystis Δ6-desaturase (U.S.
Patent No. 5,552,306) formed a third distinct group.
A comparison of amino acid motifs common to
desaturases and thought to be involved catalytically
in metal binding illustrates the overall similarity of
the protein encoded by the borage gene to desaturases
in general and the Synechocystis Δ6-desaturase in
particular (Table 1). At the same time, comparison of
the motifs in Table 1 indicates definite differences
between this protein and other plant desaturases.
Furthermore, the borage sequence is also distinguished
from known plant membrane associated fatty acid
desaturases by the presence of a heme binding motif
conserved in cytochrome b5 proteins (Schmidt et al.
1994 Plant Mol. Biol. 26:631-642) (Figure 1). Thus,
while these results clearly suggested that the
isolated cDNA was a borage Δ6-desaturase gene, further
confirmation was necessary. To confirm the identity
of the borage Δ6-desaturase cDNA, the cDNA insert from
pAN1 was cloned into an expression cassette for stable
expression. The vector pBI121 (Jefferson et al. 1987
EMBO J. 6:3901-3907) was prepared for ligation by
digestion with BamHI and EcoICR I (an isoschizomer of
SacI which leaves blunt ends; available from Promega)
which excises the GUS coding region leaving the 35S
promoter and NOS terminator intact. The borage
Δ6-desaturase cDNA was excised from the recombinant
plasmid (pAN1) by digestion with BamHI and XhoI. The
XhoI end was made blunt by performing a fill-in
reaction catalyzed by the Klenow fragment of DNA
polymerase I. This fragment was then cloned into the
BamHI/EcoICR I sites of pBI121.1, resulting in the
plasmid pAN2.

EXAMPLE 3

Production of Transgenic
Plants and Preparation and
Analysis of Fatty Acid Methyl Esters (FAMEs)

The expression plasmid, pAN2, was used to
transform tobacco (Nicotiana tabacum cv. xanthi) via
Agrobacterium tumefaciens according to standard
procedures (Horsch, et al. 1985 Science 227:1229-1231;
Bogue et al. 1990 Mol. Gen. Genet. 221:49-57) except
that the initial transformants were selected on 100
μg/ml kanamycin.

Tissue from transgenic plants was frozen in
liquid nitrogen and lyophilized overnight. FAMEs were
prepared as described by Dahmer, et al. (1989) J.
Amer. Oil. Chem. Soc. 66:543-548. In some cases, the
solvent was evaporated again, and the FAMEs were
resuspended in ethyl acetate and extracted once with
deionized water to remove any water soluble
contaminants. FAMEs were analyzed using a Tracor-560
gas liquid chromatograph as previously described
(Reddy et al. 1996 Nature Biotech. 14:639-642).

As shown in Figure 3, transgenic tobacco
leaves containing the borage cDNA produced both GLA
and octadecatetraenoic acid (OTA) (18:4 Δ6,9,12,15).
These results thus demonstrate that the isolated cDNA
encodes a borage Δ6-desaturase.

EXAMPLE 4

Expression of Δ6-desaturase in Borage

The native expression of Δ6-desaturase was
examined by Northern analysis of RNA derived from
borage tissues. RNA was isolated from developing
borage embryos following the method of Chang et al.
1993 Plant Mol. Biol. Rep. 12:113-116. RNA was
electrophoretically separated on formaldehyde-agarose
gels, blotted to nylon membranes by capillary
transfer, and immobilized by baking at 80°C for 30
minutes following standard protocols (Brown T., 1996
in Current Protocols in Molecular Biology, eds.
Ausubel, et al. [Greene Publishing and
Wiley-Interscience, New York] pp. 4.9.1-4.9.14.). The
filters were preincubated at 42°C in a solution
containing 50% deionized formamide, 5X Denhardt's
reagent, 5X SSPE (900 mM NaCl; 50 mM sodium phosphate,
pH 7.7; and 5 mM EDTA), 0.1% SDS, and 200 μg/ml
denatured salmon sperm DNA. After two hours, the
filters were added to a fresh solution of the same
composition with the addition of denatured radioactive
hybridization probe. In this instance, the probes
used were borage legumin cDNA (Fig. 16A), borage
oleosin cDNA (Fig. 16B), and borage Δ6-desaturase cDNA
(pAN1, Example 2) (Fig. 16C). The borage legumin and
oleosin cDNAs were isolated by EST cloning and
identified by comparison to the GenBank database using
the BLAST algorithm as described in Example 2.
Loading variation was corrected by normalizing to
levels of borage EF1α mRNA. EF1α mRNA was identified
by correlating to the corresponding cDNA obtained by
the EST analysis described in Example 2. The filters
were hybridized at 42°C for 12-20 hours, then washed
as described above (except that the temperature was
65°C), air dried, and exposed to X-ray film.
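
The correction for loading variation described above amounts to dividing each lane's probe signal by the EF1α signal of the same lane. The following is a minimal sketch of that arithmetic; the function names and the band intensities in the usage lines are hypothetical values chosen only for illustration and are not data from this example.

```python
def normalize_to_reference(target_signal, reference_signal):
    """Divide each lane's target band signal by its loading-control signal."""
    if len(target_signal) != len(reference_signal):
        raise ValueError("one reference value is required per lane")
    normalized = []
    for target, reference in zip(target_signal, reference_signal):
        if reference <= 0:
            raise ValueError("reference signal must be positive")
        normalized.append(target / reference)
    return normalized


def relative_to_max(values):
    """Express normalized signals as a fraction of the strongest lane."""
    peak = max(values)
    return [value / peak for value in values]


# Hypothetical densitometry readings for four time points (arbitrary units).
desaturase = [120.0, 480.0, 450.0, 90.0]
ef1_alpha = [100.0, 120.0, 90.0, 110.0]

corrected = normalize_to_reference(desaturase, ef1_alpha)
print([round(v, 2) for v in relative_to_max(corrected)])
```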

As depicted in Figs. 15A and 15B, Δ6-desaturase
is expressed primarily in borage seed.
Borage seeds reach maturation between 18-20 days post
pollination (dpp). Δ6-desaturase mRNA expression
occurs throughout the time points collected (8-20
dpp), but appears maximal from 10-16 days post
pollination. This expression profile is similar to
that seen for borage oleosin and 12S seed storage
protein mRNAs (Figs. 16A, 16B, and 16C).



WO 98/45461 



PCT/US98/07179 



EXAMPLE 5 

Isolation and Characterization of a Novel Oleosin cDNA 

The oleosin cDNA (AtS21) was isolated by 
virtual subtraction screening of an Arabidopsis 
developing seed cDNA library using a random primed 
polymerase chain reaction (RP-PCR) cDNA probe derived 
from root tissue. 

RNA PREPARATION 

Arabidopsis thaliana Landsberg erecta plants
were grown under continuous illumination in a
vermiculite/soil mixture at ambient temperature
(22°C). Siliques 2-5 days after flowering were
dissected to separately collect developing seeds and
silique coats. Inflorescences containing initial
flower buds and fully opened flowers, leaves, and
whole siliques one or three days after flowering were
also collected. Roots were obtained from seedlings
that had been grown in Gamborg B5 liquid medium (GIBCO
BRL) for two weeks. The seeds for root culture were
previously sterilized with 50% bleach for five minutes
and rinsed with water extensively. All tissues were
frozen in liquid nitrogen and stored at -80°C until
use. Total RNAs were isolated following a hot
phenol/SDS extraction and LiCl precipitation protocol
(Harris et al. 1978 Biochem. 17:3251-3256; Galau et
al. 1981 J. Biol. Chem. 256:2551-2560). Poly A+ RNA
was isolated using oligo dT column chromatography
according to manufacturers' protocols (PHARMACIA or
STRATAGENE) or using oligotex-dT latex particles
(QIAGEN).

Construction of tissue-specific cDNA libraries

Flower, one day silique, three day silique,
leaf, root, and developing seed cDNA libraries were
each constructed from 5 μg poly A+ RNA using the ZAP
cDNA synthesis kit (Stratagene). cDNAs were
directionally cloned into the EcoRI and XhoI sites of
pBluescript SK(-) in the λ-ZAPII vector (Short et al.
1988 Nucleic Acids Res. 16:7583-7600). Nonrecombinant
phage plaques were identified by blue color
development on NZY plates containing X-gal
(5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) and IPTG
(isopropyl-1-thio-β-D-galactopyranoside). The
nonrecombinant backgrounds for the flower, one day
silique, three day silique, leaf, root, and developing
seed cDNA libraries were 2.8%, 2%, 3.3%, 6.5%, 2.5%,
and 1.9% respectively.

Random priming DNA labeling 

The cDNA inserts of isolated clones
(unhybridized cDNAs) were excised by EcoRI/XhoI double
digestion and gel-purified for random priming
labeling. Klenow reaction mixture contained 50 ng DNA
templates, 10 mM Tris-HCl, pH 7.5, 5 mM MgCl2, 7.5 mM
DTT, 50 μM each of dCTP, dGTP, and dTTP, 10 μM hexamer
random primers (Boehringer Mannheim), 50 μCi α-32P-dATP,
3000 Ci/mmole, 10 mCi/ml (DuPont), and 5 units
of DNA polymerase I Klenow fragment (New England
Biolabs). The reactions were carried out at 37°C for
one hour. Aliquots of diluted reaction mixtures were
used for TCA precipitation and alkaline denaturing gel
analysis. Hybridization probes were labeled only with
Klenow DNA polymerase and the unincorporated dNTPs
were removed using Sephadex® G-50 spin columns
(Boehringer Mannheim).

Random Primed PCR 

Double-stranded cDNA was synthesized from
poly A+ RNA isolated from Arabidopsis root tissue
using the cDNA Synthesis System (GIBCO BRL) with oligo
dT12-18 as primers. cDNAs longer than 300 bp were
enriched by Sephacryl S-400 column chromatography
(Stratagene). Fractionated cDNAs were used as
templates for RP-PCR labeling. The reaction contained
10 mM Tris-HCl, pH 9.0, 50 mM KCl, 0.1% Triton X-100,
2 mM MgCl2, 5 units Taq DNA polymerase (PROMEGA), 200
μM dCTP, dGTP, and dTTP, and different concentrations
of hexamer random primers, α-32P dATP, 800 mCi/mmole,
10 mCi/ml (DuPont), and cold dATP in a final volume of
25 μl. After an initial 5 minutes at 95°C, different
reactions were run through different programs to
optimize RP-PCR cDNA conditions. Unless otherwise
indicated, the following program was used for most
RP-PCR cDNA probe labeling: 95°C/5 minutes, then 40
cycles of 95°C/30 seconds, 18°C/1 second, ramp to 30°C
at a rate of 0.1°C/second, and 72°C/1 minute. RP-PCR
products were phenol/chloroform extracted and ethanol
precipitated or purified by passing through Sephadex
G-50 spin columns (Boehringer Mannheim).
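
The slow ramp from 18°C to 30°C dominates each RP-PCR cycle, which a short calculation makes explicit. The following is a minimal sketch that uses only the temperatures, hold times, ramp rate, and cycle number stated above. It assumes that transitions between the other steps take no time, which a real thermal cycler will not achieve, so the result is a lower bound; the function names are illustrative.

```python
def ramp_seconds(start_c, end_c, rate_c_per_s):
    """Time needed to move between two temperatures at a fixed ramp rate."""
    if rate_c_per_s <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(end_c - start_c) / rate_c_per_s


def program_seconds(initial_s, cycles, hold_s, ramp_s):
    """Initial denaturation plus cycles of (hold steps + controlled ramp)."""
    return initial_s + cycles * (sum(hold_s) + ramp_s)


# Values from the program described in the text.
ramp = ramp_seconds(18.0, 30.0, 0.1)            # 18°C to 30°C at 0.1°C/second
total = program_seconds(5 * 60, 40, [30, 1, 60], ramp)

print(f"controlled ramp per cycle: {ramp:.0f} s")
print(f"programmed time, excluding other transitions: {total / 60:.1f} min")
```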

Clone blot virtual subtraction 

Mass excision of λ-ZAP cDNA libraries was
carried out by co-infecting XL1-Blue MRF' host cells
with recombinant phage from the libraries and ExAssist
helper phage (STRATAGENE). Excised phagemids were
rescued by SOLR cells. Plasmid DNAs were prepared by
boiling mini-prep method (Holmes et al. 1981 Anal.
Biochem. 114:193-197) from randomly isolated clones.
cDNA inserts were excised by EcoRI and XhoI double
digestion, and resolved on 1% agarose gels. The DNAs
were denatured in 0.5 N NaOH and 1.5 M NaCl for 45
minutes, neutralized in 0.5 M Tris-HCl, pH 8.0, and
1.5 M NaCl for 45 minutes, and then transferred by
blotting to nylon membranes (Micron Separations, Inc.)
in 10X SSC overnight. After one hour prehybridization
at 65°C, root RP-cDNA probe was added to the same
hybridization buffer containing 1% bovine albumin
fraction V (Sigma), 1 mM EDTA, 0.5 M NaHPO4, pH 7.2,
7% SDS. The hybridization continued for 24 hours at
65°C. The filters were washed in 0.5% bovine albumin,
1 mM EDTA, 40 mM NaHPO4, pH 7.2, 5% SDS for ten
minutes at room temperature, and 3 x 10 minutes in 1
mM EDTA, 40 mM NaHPO4, pH 7.2, 1% SDS at 65°C.
Autoradiographs were exposed to X-ray films (Kodak)
for two to five days at -80°C.

Hybridization of resulting blots with root
RP-PCR probes "virtually subtracted" seed cDNAs shared
with the root mRNA population. The remaining seed
cDNAs representing putative seed-specific cDNAs,
including those encoding oleosins, were sequenced by
the cycle sequencing method, thereby identifying AtS21
as an oleosin cDNA clone.

Sequence analysis of AtS21 

The oleosin cDNA is 834 bp long including an
18 bp long poly A tail (Fig. 4, SEQ ID NO: 2). It has
high homology to other oleosin genes from Arabidopsis
as well as from other species. Recently, an identical
oleosin gene has been reported (Zou, et al., 1996,
Plant Mol. Biol. 31:429-433). The predicted protein is
191 amino acids long with a highly hydrophobic middle
domain flanked by a hydrophilic domain on each side.
The existence of two in frame stop codons just
upstream of the first ATG and the similarity to other
oleosin genes indicate that this cDNA is full-length
(Figure 4, SEQ ID NO: 2). The predicted protein has
three distinctive domains based on the distribution of
its amino acid residues. Both the N-terminal and
C-terminal domains are rich in charged residues while
the central domain is absolutely hydrophobic
(Figure 5). As many as 20
leucine residues are located in the central domain and
arranged as repeats with one leucine occurring every
7-10 residues. Other non-polar amino acid residues
are also clustered in the central domain making this
domain absolutely hydrophobic (Figure 6).

Extensive searches of different databases
using both AtS21 cDNA and its predicted protein
sequence identified oleosins from carrot, maize,
cotton, rapeseed, Arabidopsis, and other plant
species. The homology is mainly restricted to the
central hydrophobic domain. Seven Arabidopsis oleosin
sequences were found. AtS21 represents the same gene
as Z54164 which has a few more bases in the 5'
untranslated region. The seven Arabidopsis oleosin
sequences available so far were aligned to each other
(Figure 7). The result suggested that the seven
sequences fall into three groups. The first group
includes AtS21 (SEQ ID NO:5), X91918 (SEQ ID NO:6),
and the partial sequence Z29859 (SEQ ID NO:7). Since
X91918 (SEQ ID NO:6) has only its last residue
different from AtS21 (SEQ ID NO:5), and since Z29859
(SEQ ID NO:7) has only three amino acid residues which
are different from AtS21 (SEQ ID NO:5), all three
sequences likely represent the same gene. The two
sequences of the second group, X62352 (SEQ ID NO:8)
and Atol3 (SEQ ID NO:9), are different in both
sequence and length. Thus, there is no doubt that
they represent two independent genes. Like the first
group, the two sequences of the third group, X91956
(SEQ ID NO:10) and L40954 (SEQ ID NO:11), also have
only three divergent residues which may be due to
sequence errors. Thus, X91956 (SEQ ID NO:10) and
L40954 (SEQ ID NO:11) likely represent the same gene.
Unlike all the other oleosin sequences which were
predicted from cDNA sequences, X62352 (SEQ ID NO:8)
was deduced from a genomic sequence (Van Rooijen et
al. 1992 Plant Mol. Biol. 18:1177-1179). In
conclusion, four different Arabidopsis oleosin genes
have been identified so far, and they are conserved
only in the middle of the hydrophobic domain.
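
The three-domain organization and the spacing of leucine residues described above can be examined with a simple sliding-window calculation. The following is a minimal sketch that assumes the Kyte-Doolittle hydropathy scale; the text does not state which method was used to generate Figures 5 and 6, and the toy sequence in the usage lines is hypothetical and is not the AtS21 sequence.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_hydropathy(protein, window=9):
    """Mean hydropathy of every full window along the sequence."""
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def leucine_spacings(protein):
    """Distances, in residues, between consecutive leucines."""
    positions = [i for i, residue in enumerate(protein.upper()) if residue == "L"]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


# Hypothetical toy protein: charged ends flanking a non-polar core.
toy = "MADTHRKDEE" + "LAVIALLGAVLTLIAGLVF" + "KRDESGKEQ"
profile = sliding_hydropathy(toy, window=9)
print("most hydrophobic window:", round(max(profile), 2))
print("leucine spacings:", leucine_spacings(toy))
```

Positive window averages mark hydrophobic stretches and negative averages mark hydrophilic ones, so a protein organized as described would show a broad positive plateau flanked by negative regions.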

Northern Analysis 

In order to characterize the expression
pattern of the native AtS21 gene, Northern analysis
was performed as described in Example 4 except that
the probe was the AtS21 cDNA (pAN1 insert) labeled
with 32P-dATP to a specific activity of 5 x 10^8 cpm/μg.

Results indicated that the AtS21 gene is
strongly expressed in developing seeds and weakly
expressed in silique coats (Figure 8A). A much larger
transcript, which might represent unprocessed AtS21
pre-mRNA, was also detected in developing seed RNA.
AtS21 was not detected in flower, leaf, root (Figure
8A), or one day silique RNAs. A different Northern
analysis revealed that AtS21 is also strongly
expressed in imbibed germinating seeds (Figs. 13A and
13B).



EXAMPLE 6 

Characterization of Oleosin 
Genomic Clones and Isolation of Oleosin Promoter 

Genomic clones were isolated by screening an
Arabidopsis genomic DNA library using the full length
cDNA (AtS21) as a probe. Two genomic clones were
mapped by restriction enzyme digestion followed by
Southern hybridization using the 5' half of the cDNA
cleaved by SacI as a probe. A 2 kb SacI fragment was
subcloned and sequenced (Fig. 9, SEQ ID NO:35). Two
regions of the genomic clone are identical to the cDNA
sequence. A 395 bp intron separates the two regions.

The copy number of the AtS21 gene in the
Arabidopsis genome was determined by genomic DNA
Southern hybridization following digestion with the
enzymes BamHI, EcoRI, HindIII, SacI and XbaI, using
the full length cDNA as a probe (Figure 8B). A single
band was detected in all the lanes except the SacI
digestion, where two bands were detected. Since the
cDNA probe has an internal SacI site, these results
indicated that AtS21 is a single copy gene in the
Arabidopsis genome. Since the Arabidopsis genome is
known to contain different isoforms of
oleosin genes, this Southern analysis also
demonstrates that the different oleosin isoforms of
Arabidopsis are divergent at the DNA sequence level.

Two regions, separated by a 395 bp intron,
of the genomic DNA fragment are identical to the AtS21
cDNA sequence. Database searches using the 5'
promoter sequence upstream of the AtS21 cDNA sequence did
not identify any sequence with significant homology.
Furthermore, the comparison of the AtS21 promoter sequence
with another Arabidopsis oleosin promoter isolated
previously (Van Rooijen, et al., 1992) revealed
little similarity. The AtS21 promoter sequence is
rich in A/T bases, and contains as many as 44 direct
repeats ranging from 10 bp to 14 bp with only one
mismatch allowed. Two 14 bp direct repeats, and a
putative ABA response element are underlined in Figure
9.
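
A search for direct repeats that tolerates one mismatch can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not describe the program or the exact criteria used to count the 44 repeats, so the definition used here (two non-overlapping windows of equal length on the same strand differing at no more than one position) and the function names are assumptions. The usage sequence is hypothetical and is not the AtS21 promoter.

```python
def mismatches_within(first, second, limit):
    """Count mismatches between equal-length strings; None if limit exceeded."""
    count = 0
    for a, b in zip(first, second):
        if a != b:
            count += 1
            if count > limit:
                return None
    return count


def find_direct_repeats(seq, length, max_mismatches=1):
    """Return (start1, start2, mismatches) for non-overlapping repeat pairs."""
    seq = seq.upper()
    hits = []
    last_start = len(seq) - length
    for i in range(last_start + 1):
        first = seq[i:i + length]
        # Second copy must begin at or after the end of the first copy.
        for j in range(i + length, last_start + 1):
            found = mismatches_within(first, seq[j:j + length], max_mismatches)
            if found is not None:
                hits.append((i, j, found))
    return hits


# Hypothetical sequence containing one imperfect 10 bp direct repeat.
example = "AAAAACCCCC" + "G" + "AAAAACCCCT"
for repeat_length in range(10, 15):
    print(repeat_length, find_direct_repeats(example, repeat_length))
```

The pairwise comparison is quadratic in sequence length, which is adequate for a promoter fragment of roughly 1.3 kb but would need indexing for genome-scale input.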
EXAMPLE 7

Construction of AtS21
Promoter/GUS Gene Expression Cassette and Expression
Patterns in Transgenic Arabidopsis and Tobacco

Construction of AtS21 promoter/GUS gene expression
cassette

The 1267 bp promoter fragment starting from
the first G upstream of the ATG codon of the genomic
DNA fragment was amplified using PCR and fused to the
GUS reporter gene for analysis of its activity.
The promoter fragment of the AtS21 genomic clone was
amplified by PCR using the T7 primer
GTAATACGACTCACTATAGGGC (SEQ ID NO:13) and the 21P
primer GGGGATCCTATACTAAAACTATAGAGTAAAGG (SEQ ID NO:14)
complementary to the 5' untranslated region upstream
of the first ATG codon (Figure 9). A BamHI cloning
site was introduced by the 21P primer. The amplified
fragment was cloned into the BamHI and SacI sites of
pBluescript KS (Stratagene). Individual clones were
sequenced to check possible PCR mutations as well as
the orientation of their inserts. The correct clone
was digested with BamHI and HindIII, and the excised
promoter fragment (1.3 kb) was cloned into the
corresponding sites of pBI101.1 (Jefferson, R.A.
1987a, Plant Mol. Biol. Rep. 5:387-405; Jefferson et
al., 1987b, EMBO J. 6:3901-3907) upstream of the GUS
gene. The resultant plasmid was designated pAN5 (Fig.
10). The AtS21 promoter/GUS construct (pAN5) was
introduced into both tobacco (by the leaf disc method,
Horsch et al., 1985; Bogue et al. 1990 Mol. Gen. Genet.
221:49-57) and Arabidopsis Columbia ecotype via vacuum
infiltration as described by Bechtold, et al. (1993)
C.R. Acad. Sci. Paris, 316:1194-1199. Seeds were
sterilized and selected on media containing 50 μg/ml
kanamycin, 500 μg/ml carbenicillin.
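
The statement that the 21P primer introduces a BamHI cloning site can be checked directly against the primer sequences given above. The following is a minimal sketch; the primer sequences are those recited in this example, the BamHI recognition sequence GGATCC is the standard one, and the function and variable names are illustrative assumptions.

```python
T7_PRIMER = "GTAATACGACTCACTATAGGGC"                 # SEQ ID NO:13
PRIMER_21P = "GGGGATCCTATACTAAAACTATAGAGTAAAGG"      # SEQ ID NO:14
BAMHI_SITE = "GGATCC"                                # BamHI recognition sequence


def find_sites(sequence, site):
    """Return every 0-based start position of a recognition site."""
    sequence, site = sequence.upper(), site.upper()
    return [i for i in range(len(sequence) - len(site) + 1)
            if sequence[i:i + len(site)] == site]


def reverse_complement(sequence):
    """Reverse complement of an unambiguous DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(sequence.upper()))


for name, primer in (("T7", T7_PRIMER), ("21P", PRIMER_21P)):
    print(name, len(primer), "nt, BamHI sites at", find_sites(primer, BAMHI_SITE))

# The site is palindromic, so it reads the same on the complementary strand.
print(reverse_complement(BAMHI_SITE) == BAMHI_SITE)
```

The two G residues preceding the site at the 5' end of the 21P primer act as a short clamp ahead of the recognition sequence.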
GUS activity assay: Expression patterns of the
reporter GUS gene were revealed by histochemical
staining (Jefferson, et al., 1987a, Plant Mol. Biol.
Rep. 5:387-405). Different tissues were stained in
substrate solution containing 2 mg/ml
5-bromo-4-chloro-3-indolyl-β-D-glucuronic acid (X-Gluc)
(Research Organics, Inc.), 0.5 mM potassium
ferrocyanide, and 0.5 mM potassium ferricyanide in 50
mM sodium phosphate buffer, pH 7.0 at 37°C overnight,
and then dehydrated successively in 20%, 40% and 80%
ethanol (Jefferson, et al., 1987). Photographs were
taken using an Axiophot (Zeiss) compound microscope or
Olympus SZH10 dissecting microscope. Slides were
converted to digital images using a Spring/Scan 35LE
slide scanner (Polaroid) and compiled using Adobe
Photoshop™ 3.0.5 and Canvas™ 3.5.

GUS activities were quantitatively measured
by fluorometry using 2 mM 4-MUG
(4-methylumbelliferyl-β-D-glucuronide) as substrate
(Jefferson, et al., 1987). Developing Arabidopsis seeds
were staged according to their colors, and other plant tissues
were collected and kept at -80°C until use. Plant
tissues were ground in extraction buffer containing 50
mM sodium phosphate, pH 7.0, 10 mM EDTA, 10 mM
β-mercaptoethanol, 0.1% Triton X-100, and 0.1% sodium
lauryl sarcosine. The tissue debris was removed by 5
minutes centrifugation in a microfuge. The
supernatant was aliquoted and mixed with substrate and
incubated at 37°C for 1 hour. Three replicates were
assayed for each sample. The reactions were stopped
by adding 4 volumes of 0.2 M sodium carbonate.
Fluorescence was read using a TKO-100 DNA fluorometer
(Hoefer Scientific Instruments). Protein
concentrations of the extracts were determined by the
Bradford method (Bio Rad).
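
Fluorometric readings of this kind are commonly converted to a specific activity expressed as pmol of 4-methylumbelliferone (4-MU) released per minute per mg of protein. The following is a minimal sketch of that conversion; the one-hour incubation and the triplicate measurements come from the text, whereas the linear standard-curve slope, the blank value, the readings, and the function names are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def mu_pmol(fluorescence, blank, units_per_pmol):
    """Convert a fluorescence reading to pmol 4-MU with a linear standard."""
    if units_per_pmol <= 0:
        raise ValueError("standard-curve slope must be positive")
    return (fluorescence - blank) / units_per_pmol


def specific_activity(pmol_product, minutes, protein_mg):
    """Specific activity in pmol 4-MU per minute per mg protein."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("time and protein amount must be positive")
    return pmol_product / (minutes * protein_mg)


# Hypothetical triplicate readings for one extract (arbitrary units).
readings = [412.0, 398.0, 405.0]
blank = 5.0                 # hypothetical no-extract control
slope = 0.5                 # hypothetical fluorescence units per pmol 4-MU
protein_mg = 0.01           # hypothetical protein in the assay (Bradford)

product = mu_pmol(mean(readings), blank, slope)
print(round(specific_activity(product, 60, protein_mg), 1),
      "pmol 4-MU/min/mg protein")
```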

Expression patterns of AtS21 promoter/GUS in
transgenic Arabidopsis and tobacco

In Arabidopsis, GUS activity was detected in
green seeds, and node regions where siliques, cauline
leaves and branches join the inflorescence stem
(Figures 11A and 11B). No GUS activity was detected
in any leaf, root, flower, silique coat, or the
internode regions of the inflorescence stem. Detailed
studies of the GUS expression in developing seeds
revealed that the AtS21 promoter was only active in
green seeds in which the embryos had already developed
beyond heart stage (Figures 11C and 11G). The
youngest embryos showing GUS activity that could be
detected by histochemical staining were at early
torpedo stage. Interestingly, the staining was
restricted to the lower part of the embryo including
hypocotyl and embryonic radicle. No staining was
detected in the young cotyledons (Figures 11D and
11E). Cotyledons began to be stained when the embryos
were at late torpedo or even early cotyledon stage
(Figures 11F and 11H). Later, the entire embryos were
stained, and the staining became more intense as the
embryos matured (Figures 11I and 11J). It was also
observed that GUS gene expression was restricted to
the embryos. Seed coat and young endosperm were not
stained (Figure 11C).

GUS activity was also detected in developing 
seedlings. Young seedlings 3-5 days old were 
stained throughout. Although some root hairs close to 
the hypocotyl were stained (Figure 11K), most of the 
newly formed structures such as root hairs, lateral 
root primordia and shoot apex were not stained 
(Figures 11L and 11N). Later, when lateral 
roots grew from the elongating embryonic root, the 
staining was restricted to cotyledons and hypocotyls. The 
staining on embryonic roots disappeared. No staining 
was observed on newly formed lateral roots, true 
leaves, or trichomes on true leaves (Figures 11M and 
11N). 

AtS21 promoter/GUS expression patterns in 
tobacco are basically the same as in Arabidopsis. GUS 
activity was detected only in late stage seeds and in 
different node regions of mature plants. In 
germinating seeds, strong staining was detected 
throughout the entire embryos as soon as one hour 
after they were dissected from imbibed seeds. Mature 
endosperm, which Arabidopsis seeds do not have, was 
also stained, but the seed coat was not (Figure 12A). The root 
tips of some young seedlings of one transgenic line 
were not stained (Figure 12B). Otherwise, GUS 
expression patterns in developing tobacco seedlings 
were the same as in Arabidopsis seedlings (Figures 
12B, 12C, and 12D). Newly formed structures such as 
lateral roots and true leaves were not stained. 

AtS21 mRNA levels in developing seedlings 

Since the strong activity of 
AtS21 promoter/GUS observed in both Arabidopsis and tobacco 
seedlings is not consistent with the seed-specific 
expression of oleosin genes, Northern analysis was 
carried out to determine whether AtS21 mRNA was present in 
developing seedlings where the GUS activity was so 
strong. RNAs prepared from seedlings at different 
stages from 24 hours to 12 days were analyzed by 
Northern hybridization using AtS21 cDNA as the probe. 
Surprisingly, AtS21 mRNA was detected in 24-48 hour 
imbibed seeds at a high level comparable to that in 
developing seeds. The mRNA level dropped dramatically 
when young seedlings first emerged at 74 hours 
(Figures 13A and 13B). In 96 hour and older 
seedlings, no signal was detected even with a longer 
exposure (Figure 13B). The loadings of RNA samples 
were checked by hybridizing the same blot with a 
tubulin gene probe (Figure 13C), which was isolated and 
identified by EST analysis as described in Example 2. 
Since AtS21 mRNA was so abundant in seeds, residual 
AtS21 probe remained on the blot even after extensive 
stripping. These results indicated that the AtS21 mRNA 
detected in imbibed seeds and very young seedlings was 
carried over from dry seeds. It has 
recently been reported that an oleosin Atol2 mRNA 
(identical to AtS21) is most abundant in dry seeds 
(Kirik, et al., 1996 Plant Mol. Biol. 31(2):413-417). 
Similarly, the strong GUS activities in seedlings were 
most likely due to both the carry-over of 
β-glucuronidase protein and the de novo synthesis of 
β-glucuronidase from its mRNA carried over from the dry 
seed stage. 

EXAMPLE 8 

Activity comparison between the 
AtS21 promoter and the 35S promoter 

The GUS activities expressed by the AtS21 promoter in 
developing seeds of transgenic Arabidopsis were 
compared with those expressed by the 35S promoter in 
the construct pBI221 (Jefferson et al. EMBO J. 
6:3901-3907). The seeds were staged according to their 
colors (Table 2). The earliest stage was from 
globular to late heart stage, when the seeds were still 
white but large enough to be dissected from the 
siliques. AtS21 promoter activity was detected at a 
level about three times lower than that of the 35S 
promoter at this stage. 35S promoter activity 
remained at the same low level throughout 
embryo development. In contrast, AtS21 promoter 
activity increased quickly as the embryos passed 
torpedo stage and reached its highest level, 25.25 
pmol 4-MU/min·µg protein, at mature stage (Figure 5-8). 
The peak activity of the AtS21 promoter is as 
much as 210 times higher than its lowest activity at 
globular to heart stage, and is close to 100 times 
higher than the 35S promoter activity at the same 
stage (Table 2). The activity levels of the AtS21 
promoter are similar to those of another Arabidopsis 
oleosin promoter expressed in Brassica napus (Plant et 
al. 1994, Plant Mol. Biol. 25:193-205). AtS21 promoter 
activity was also detected at background level in 
leaf. The high standard deviation, higher than the 
average itself, indicated that GUS activity was 
detected in the leaves of only some lines (Table 2). 
On the other hand, 35S promoter activity in leaf was 
more than 20 times higher than that in seed. A 
side-by-side comparison of the activities of the AtS21 
promoter and the 35S promoter is shown in Figure 14. 
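
The fold differences quoted above can be related back to the one absolute value given in the text (25.25 pmol 4-MU/min·µg protein). The sketch below is only a back-calculation under that assumption: the two derived numbers are implied approximations, because the text says "as much as" and "close to", and they are not values read from Table 2.

```python
# Peak AtS21 promoter activity stated in the text (pmol 4-MU/min/ug protein).
PEAK_ATS21 = 25.25

# Fold differences stated in the text; both are approximate.
FOLD_OVER_LOWEST_ATS21 = 210   # peak vs. globular-to-heart stage
FOLD_OVER_35S_MATURE = 100     # AtS21 peak vs. 35S at the same stage

# Implied (not measured) activities, obtained by dividing the peak value.
implied_lowest_ats21 = PEAK_ATS21 / FOLD_OVER_LOWEST_ATS21
implied_35s_mature = PEAK_ATS21 / FOLD_OVER_35S_MATURE

print(f"implied lowest AtS21 activity:  ~{implied_lowest_ats21:.2f}")
print(f"implied 35S activity at mature: ~{implied_35s_mature:.2f}")
```

Both implied values fall well below 1 pmol 4-MU/min·µg protein, which is consistent with the statement that 35S promoter activity stayed at a low level throughout embryo development.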

Although the AtS21 promoter activity was 
about 3 times lower in dry seed of tobacco than in 
Arabidopsis dry seed, the absolute GUS activity was 
still higher than that expressed by the 35S promoter 
in Arabidopsis leaf (Table 2). No detectable AtS21 
promoter activity was observed in tobacco leaf (Figure 
14). 

Comparison of the AtS21 promoter with the 
35S promoter revealed that the latter is not a good 
promoter for expressing genes at high levels in developing 
seeds. Because of its consistently low activity 
throughout the entire embryo development period, the 35S 
promoter is useful for consistent low level expression 
of target genes. On the other hand, the AtS21 
promoter is a very strong promoter that can be used to 
express genes starting from heart stage embryos and 
accumulating until the dry seed stage. The 35S 
promoter, although not efficient, is better than the 
AtS21 promoter for expressing genes in embryos prior to 
heart stage. 



[Table 2 (GUS activities of the AtS21 and 35S promoters, cited in the text above) is illegible in this copy.] 



EXAMPLE 9 

Expression of the Borage Δ6-Desaturase Gene Under 
the Control of the AtS21 Promoter and Comparison to 
Expression Under the Control of the CaMV 35S Promoter 

In order to create an expression construct 
with the AtS21 promoter driving expression of the 
borage Δ6-desaturase gene, the GUS coding fragment 
from pAN5 was removed by digestion with SmaI and 
EcoICR I. The cDNA insert of pAN1 (Example 2) was then 
excised by first digesting with XhoI (and filling in 
the residual overhang as above), and then digesting 
with SmaI. The resulting fragment was used to replace 
the excised portion of pAN5, yielding pAN3. 

After transformation of tobacco and 
Arabidopsis following the methods of Example 7, levels 
of Δ6-desaturase activity were monitored by assaying 
the corresponding fatty acid methyl esters of its 
reaction products, γ-linolenic acid (GLA) and 
octadecatetraenoic acid (OTA), using the methods 
referred to in Example 3. The GLA and OTA levels 
(Table 3) of the transgenic seeds ranged up to 6.7% of 
C18 fatty acids (Mean = 3.1%) and 2.8% (Mean = 1.1%), 
respectively. No GLA or OTA was detected in the 
leaves of these plants. In comparison, CaMV 35S 
promoter/Δ6-desaturase transgenic plants produced GLA 
levels in seeds ranging up to 3.1% of C18 fatty acids 
(Mean = 1.3%) and no measurable OTA in seeds. 

EXAMPLE 10 

Transformation of Oilseed Rape With an Expression 
Cassette Which Comprises the Oleosin 5' Regulatory 
Region Linked to the Borage Delta 6-Desaturase Gene 

Oilseed rape, cv. Westar, was transformed 
with Agrobacterium tumefaciens strain EHA105 
containing the plasmid pAN3 (i.e., the borage 
Δ6-desaturase gene under the control of the Arabidopsis 
oleosin promoter; Example 9). 

Terminal internodes of Westar were co-cultivated 
for 2-3 days with induced Agrobacterium 
tumefaciens strain EHA105 (Alt-Moerbe et al. 1988 Mol. 
Gen. Genet. 213:1-8; James et al. 1993 Plant Cell 
Reports 12:559-563), then transferred onto 
regeneration medium (Boulter et al. 1990 Plant Science 
70:91-99; Fry et al. 1987 Plant Cell Reports 
6:321-325). The regenerated shoots were transferred to 
growth medium (Pelletier et al. 1983 Mol. Gen. Genet. 
191:244-250), and a polymerase chain reaction (PCR) 
test was performed on leaf fragments to assess the 
presence of the gene. 

DNA was isolated from the leaves according 
to the protocol of K.M. Haymes et al. (1996) Plant 
Molecular Biology Reporter 14(3):280-284, and 
resuspended in 100 µl of water, without RNase 
treatment. 5 µl of extract was used for the PCR 
reaction, in a final volume of 50 µl. The reaction was 
performed in a Perkin-Elmer 9600 thermocycler, with 
the following cycles: 

1 cycle: 95°C, 5 minutes 

30 cycles: 95°C, 45 sec; 52°C, 45 sec; 
72°C, 1 minute 

1 cycle: 72°C, 5 minutes 

and the following primers (derived from near the metal 
box regions, as indicated in Fig. 1, SEQ ID NO:1): 
5' TGG AAA TGG AAC CAT AA 3' 
5' GGA AAC AAA TGA TGC TC 3' 

Amplification of the DNA revealed the expected 549 
base pair PCR fragment (Figure 17) . 
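
The relationship between the two primers and SEQ ID NO:1 can be checked with a short script. This is an illustrative sketch, not part of the described procedure: the helper name and the two short template excerpts (copied from the coding strand of SEQ ID NO:1 with spaces removed) are chosen here for demonstration only.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


FORWARD_PRIMER = "TGGAAATGGAACCATAA"
REVERSE_PRIMER = "GGAAACAAATGATGCTC"

# Short excerpts of the coding strand of SEQ ID NO:1 around each primer site.
EXCERPT_UPSTREAM = "GGTTGGTGGAAATGGAACCATAATGCACATCAC"
EXCERPT_DOWNSTREAM = "CAAATTGAGCATCATTTGTTTCCCAAGATG"

# The forward primer should match the coding strand directly; the reverse
# primer anneals to the opposite strand, so its reverse complement should
# appear in the coding strand.
forward_site = EXCERPT_UPSTREAM.find(FORWARD_PRIMER)
reverse_site = EXCERPT_DOWNSTREAM.find(reverse_complement(REVERSE_PRIMER))

print("forward primer found:", forward_site >= 0)
print("reverse primer found:", reverse_site >= 0)
```

Searching for the reverse complement of the second primer, rather than the primer itself, reflects the fact that the two primers anneal to opposite strands and point toward each other.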

The positive shoots were transferred to 
elongation medium, then to rooting medium (DeBlock et 
al. 1989 Plant Physiol. 91:694-701). Shoots with a 
well-developed root system were transferred to the 
greenhouse. When plants were well developed, leaves 
were collected for Southern analysis in order to 
assess gene copy number. 

Genomic DNA was extracted according to the 
procedure of Bouchez et al. (1996) Plant Molecular 
Biology Reporter 14:115-123, digested with the 
restriction enzymes Bgl I and/or Cla I, 
electrophoretically separated on agarose gel (Maniatis 
et al. 1982, in Molecular Cloning: A Laboratory 
Manual. Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, NY), and prepared for transfer to nylon 
membranes (Nytran membrane, Schleicher & Schuell) 
according to the instructions of the manufacturer. 
DNA was then transferred to membranes overnight by 
capillary action using 20X SSC (Maniatis et al. 1982). 
Following transfer, the membranes were crosslinked by 
UV (Stratagene) for 30 seconds and pre-hybridized for 
1 hour at 65°C in 15 ml of a solution containing 
6X SSC, 0.5% SDS and 2.25% w/w dehydrated skim milk in 
glass vials in a hybridization oven (Appligene). The 
membranes were hybridized overnight in the same 
solution containing a denatured hybridization probe 
radiolabelled with ³²P to a specific activity of 10⁸ 
cpm/µg by the random primer method (with the Ready-To-Go 
kit obtained from Pharmacia). The probe represents 
a PCR fragment of the borage delta 6-desaturase gene 
(obtained under the conditions and with the primers 
detailed above). After hybridization, the filters 
were washed at 65°C in 2X SSC, 0.1% SDS for 15 minutes, 
and 0.2X SSC, 0.1% SDS for 15 minutes. The membranes 
were then wrapped in Saran Wrap and exposed to Kodak 
XAR film using an intensifying screen at -70°C in a 
light-proof cassette. Exposure time was generally 3 
days. 
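
The SSC concentrations used above (6X, 2X and 0.2X) are all dilutions of the 20X stock. A minimal sketch of the dilution arithmetic (C1·V1 = C2·V2) follows; the function name is illustrative, and the 100 ml wash volume is an assumption made here, since only the 15 ml pre-hybridization volume is stated in the text.

```python
def stock_volume_ml(stock_x: float, final_x: float, final_volume_ml: float) -> float:
    """Volume of concentrated stock needed, from C1 * V1 = C2 * V2."""
    if final_x > stock_x:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_x * final_volume_ml / stock_x


STOCK_X = 20.0          # 20X SSC stock
PREHYB_VOLUME_ML = 15.0  # stated in the text
WASH_VOLUME_ML = 100.0   # assumed wash volume, not given in the text

# 6X SSC for pre-hybridization, then the two wash strengths.
for label, final_x, volume in [
    ("pre-hybridization, 6X", 6.0, PREHYB_VOLUME_ML),
    ("first wash, 2X", 2.0, WASH_VOLUME_ML),
    ("second wash, 0.2X", 0.2, WASH_VOLUME_ML),
]:
    needed = stock_volume_ml(STOCK_X, final_x, volume)
    print(f"{label}: {needed:.2f} ml of 20X SSC in {volume:.0f} ml total")
```

The second wash uses a tenfold lower salt concentration than the first at the same temperature, which makes it the more stringent of the two washes.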

The results obtained confirm the presence of 
the gene. According to the gene construct, the number 
of bands in each lane of DNA digested by Bgl I or Cla 
I represents the number of delta 6-desaturase genes 
present in the genomic DNA of the plant. The 
digestion with Bgl I and Cla I together generates a 
fragment of 3435 bp. 

The term "comprises" or "comprising" is 
defined as specifying the presence of the stated 
features, integers, steps, or components as referred to 
in the claims, but does not preclude the presence or 
addition of one or more other features, integers, steps, 
components, or groups thereof. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Rhone Poulenc Agro 
Thomas, Terry L . 
Li, Zhongsen 

(ii) TITLE OF INVENTION: AN OLEOSIN 5' REGULATORY REGION FOR THE 

MODIFICATION OF PLANT SEED LIPID COMPOSITION 

(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Scully, Scott, Murphy & Presser 

(B) STREET: 400 Garden City Plaza 

(C) CITY: Garden City 

(D) STATE: New York 

(E) COUNTRY: USA 

(F) ZIP: 11530 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/831,575 

(B) FILING DATE: 9 April 1997 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: DiGiglio, Frank S. 

(B) REGISTRATION NUMBER: 31,346 

(C) REFERENCE/DOCKET NUMBER: 10203 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (516) 742-4343 

(B) TELEFAX: (516) 742-4366 

(2) INFORMATION FOR SEQ ID NO : 1 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1684 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 43..1387 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 

ATATCTGCCT ACCCTCCCAA AGAGAGTAGT CATTTTTCAT CA ATG GCT GCT CAA 54 
Met Ala Ala Gln 
1 

ATC AAG AAA TAC ATT ACC TCA GAT GAA CTC AAG AAC CAC GAT AAA CCC 102 
Ile Lys Lys Tyr Ile Thr Ser Asp Glu Leu Lys Asn His Asp Lys Pro 
5 10 15 20 

GGA GAT CTA TGG ATC TCG ATT CAA GGG AAA GCC TAT GAT GTT TCG GAT 150 
Gly Asp Leu Trp Ile Ser Ile Gln Gly Lys Ala Tyr Asp Val Ser Asp 
25 30 35 

TGG GTG AAA GAC CAT CCA GGT GGC AGC TTT CCC TTG AAG AGT CTT GCT 198 
Trp Val Lys Asp His Pro Gly Gly Ser Phe Pro Leu Lys Ser Leu Ala 
40 45 50 

GGT CAA GAG GTA ACT GAT GCA TTT GTT GCA TTC CAT CCT GCC TCT ACA 246 
Gly Gln Glu Val Thr Asp Ala Phe Val Ala Phe His Pro Ala Ser Thr 
55 60 65 

TGG AAG AAT CTT GAT AAG TTT TTC ACT GGG TAT TAT CTT AAA GAT TAC 294 
Trp Lys Asn Leu Asp Lys Phe Phe Thr Gly Tyr Tyr Leu Lys Asp Tyr 
70 75 80 

TCT GTT TCT GAG GTT TCT AAA GAT TAT AGG AAG CTT GTG TTT GAG TTT 342 
Ser Val Ser Glu Val Ser Lys Asp Tyr Arg Lys Leu Val Phe Glu Phe 
85 90 95 100 

TCT AAA ATG GGT TTG TAT GAC AAA AAA GGT CAT ATT ATG TTT GCA ACT 390 
Ser Lys Met Gly Leu Tyr Asp Lys Lys Gly His Ile Met Phe Ala Thr 
105 110 115 

TTG TGC TTT ATA GCA ATG CTG TTT GCT ATG AGT GTT TAT GGG GTT TTG 438 
Leu Cys Phe Ile Ala Met Leu Phe Ala Met Ser Val Tyr Gly Val Leu 
120 125 130 

TTT TGT GAG GGT GTT TTG GTA CAT TTG TTT TCT GGG TGT TTG ATG GGG 486 
Phe Cys Glu Gly Val Leu Val His Leu Phe Ser Gly Cys Leu Met Gly 
135 140 145 

TTT CTT TGG ATT CAG AGT GGT TGG ATT GGA CAT GAT GCT GGG CAT TAT 534 
Phe Leu Trp Ile Gln Ser Gly Trp Ile Gly His Asp Ala Gly His Tyr 
150 155 160 

ATG GTA GTG TCT GAT TCA AGG CTT AAT AAG TTT ATG GGT ATT TTT GCT 582 
Met Val Val Ser Asp Ser Arg Leu Asn Lys Phe Met Gly Ile Phe Ala 
165 170 175 180 

GCA AAT TGT CTT TCA GGA ATA AGT ATT GGT TGG TGG AAA TGG AAC CAT 630 
Ala Asn Cys Leu Ser Gly Ile Ser Ile Gly Trp Trp Lys Trp Asn His 
185 190 195 

AAT GCA CAT CAC ATT GCC TGT AAT AGC CTT GAA TAT GAC CCT GAT TTA 678 
Asn Ala His His Ile Ala Cys Asn Ser Leu Glu Tyr Asp Pro Asp Leu 
200 205 210 

CAA TAT ATA CCA TTC CTT GTT GTG TCT TCC AAG TTT TTT GGT TCA CTC 726 
Gln Tyr Ile Pro Phe Leu Val Val Ser Ser Lys Phe Phe Gly Ser Leu 
215 220 225 

ACC TCT CAT TTC TAT GAG AAA AGG TTG ACT TTT GAC TCT TTA TCA AGA 774 
Thr Ser His Phe Tyr Glu Lys Arg Leu Thr Phe Asp Ser Leu Ser Arg 
230 235 240 

TTC TTT GTA AGT TAT CAA CAT TGG ACA TTT TAC CCT ATT ATG TGT GCT 822 
Phe Phe Val Ser Tyr Gln His Trp Thr Phe Tyr Pro Ile Met Cys Ala 
245 250 255 260 

GCT AGG CTC AAT ATG TAT GTA CAA TCT CTC ATA ATG TTG TTG ACC AAG 870 
Ala Arg Leu Asn Met Tyr Val Gln Ser Leu Ile Met Leu Leu Thr Lys 
265 270 275 

AGA AAT GTG TCC TAT CGA GCT CAG GAA CTC TTG GGA TGC CTA GTG TTC 918 
Arg Asn Val Ser Tyr Arg Ala Gln Glu Leu Leu Gly Cys Leu Val Phe 
280 285 290 

TCG ATT TGG TAC CCG TTG CTT GTT TCT TGT TTG CCT AAT TGG GGT GAA 966 
Ser Ile Trp Tyr Pro Leu Leu Val Ser Cys Leu Pro Asn Trp Gly Glu 
295 300 305 

AGA ATT ATG TTT GTT ATT GCA AGT TTA TCA GTG ACT GGA ATG CAA CAA 1014 
Arg Ile Met Phe Val Ile Ala Ser Leu Ser Val Thr Gly Met Gln Gln 
310 315 320 

GTT CAG TTC TCC TTG AAC CAC TTC TCT TCA AGT GTT TAT GTT GGA AAG 1062 
Val Gln Phe Ser Leu Asn His Phe Ser Ser Ser Val Tyr Val Gly Lys 
325 330 335 340 

CCT AAA GGG AAT AAT TGG TTT GAG AAA CAA ACG GAT GGG ACA CTT GAC 1110 
Pro Lys Gly Asn Asn Trp Phe Glu Lys Gln Thr Asp Gly Thr Leu Asp 
345 350 355 

ATT TCT TGT CCT CCT TGG ATG GAT TGG TTT CAT GGT GGA TTG CAA TTC 1158 
Ile Ser Cys Pro Pro Trp Met Asp Trp Phe His Gly Gly Leu Gln Phe 
360 365 370 

CAA ATT GAG CAT CAT TTG TTT CCC AAG ATG CCT AGA TGC AAC CTT AGG 1206 
Gln Ile Glu His His Leu Phe Pro Lys Met Pro Arg Cys Asn Leu Arg 
375 380 385 

AAA ATC TCG CCC TAC GTG ATC GAG TTA TGC AAG AAA CAT AAT TTG CCT 1254 
Lys Ile Ser Pro Tyr Val Ile Glu Leu Cys Lys Lys His Asn Leu Pro 
390 395 400 

TAC AAT TAT GCA TCT TTC TCC AAG GCC AAT GAA ATG ACA CTC AGA ACA 1302 
Tyr Asn Tyr Ala Ser Phe Ser Lys Ala Asn Glu Met Thr Leu Arg Thr 
405 410 415 420 

TTG AGG AAC ACA GCA TTG CAG GCT AGG GAT ATA ACC AAG CCG CTC CCG 1350 
Leu Arg Asn Thr Ala Leu Gln Ala Arg Asp Ile Thr Lys Pro Leu Pro 
425 430 435 

AAG AAT TTG GTA TGG GAA GCT CTT CAC ACT CAT GGT T AAAATTACCC 1397 
Lys Asn Leu Val Trp Glu Ala Leu His Thr His Gly 
440 445 

TTAGTTCATG TAATAATTTG AGATTATGTA TCTCCTATGT TTGTGTCTTG TCTTGGTTCT 1457 

ACTTGTTGGA GTCATTGCAA CTTGTCTTTT ATGGTTTATT AGATGTTTTT TAATATATTT 1517 

TAGAGGTTTT GCTTTCATCT CCATTATTGA TGAATAAGGA GTTGCATATT GTCAATTGTT 1577 

GTGCTCAATA TCTGATATTT TGGAATGTAC TTTGTACCAC GTGGTTTTCA GTTGAAGCTC 1637 

ATGTGTACTT CTATAGACTT TGTTTAAATG GTTATGTCAT GTTATTT 1684 

(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Met Ala Ala Gln Ile Lys Lys Tyr Ile Thr Ser Asp Glu Leu Lys Asn 
1 5 10 15 

His Asp Lys Pro Gly Asp Leu Trp Ile Ser Ile Gln Gly Lys Ala Tyr 
20 25 30 

Asp Val Ser Asp Trp Val Lys Asp His Pro Gly Gly Ser Phe Pro Leu 
35 40 45 

Lys Ser Leu Ala Gly Gln Glu Val Thr Asp Ala Phe Val Ala Phe His 
50 55 60 

Pro Ala Ser Thr Trp Lys Asn Leu Asp Lys Phe Phe Thr Gly Tyr Tyr 
65 70 75 80 

Leu Lys Asp Tyr Ser Val Ser Glu Val Ser Lys Asp Tyr Arg Lys Leu 
85 90 95 

Val Phe Glu Phe Ser Lys Met Gly Leu Tyr Asp Lys Lys Gly His Ile 
100 105 110 

Met Phe Ala Thr Leu Cys Phe Ile Ala Met Leu Phe Ala Met Ser Val 
115 120 125 

Tyr Gly Val Leu Phe Cys Glu 
130 135 

(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 834 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 31..603 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 

TTAGCCTTTA CTCTATAGTT TTAGATAGAC ATG GCG AAT GTG GAT CGT GAT CGG 54 
Met Ala Asn Val Asp Arg Asp Arg 
140 

CGT GTG CAT GTA GAC CGT ACT GAC AAA CGT GTT CAT CAG CCA AAC TAC 102 
Arg Val His Val Asp Arg Thr Asp Lys Arg Val His Gln Pro Asn Tyr 
145 150 155 

GAA GAT GAT GTC GGT TTT GGT GGC TAT GGC GGT TAT GGT GCT GGT TCT 150 
Glu Asp Asp Val Gly Phe Gly Gly Tyr Gly Gly Tyr Gly Ala Gly Ser 
160 165 170 175 

GAT TAT AAG AGT CGC GGC CCC TCC ACT AAC CAA ATC TTG GCA CTT ATA 198 
Asp Tyr Lys Ser Arg Gly Pro Ser Thr Asn Gln Ile Leu Ala Leu Ile 
180 185 190 

GCA GGA GTT CCC ATT GGT GGC ACA CTG CTA ACC CTA GCT GGA CTC ACT 246 
Ala Gly Val Pro Ile Gly Gly Thr Leu Leu Thr Leu Ala Gly Leu Thr 
195 200 205 

CTA GCC GGT TCG GTG ATC GGC TTG CTA GTC TCC ATA CCC CTC TTC CTC 294 
Leu Ala Gly Ser Val Ile Gly Leu Leu Val Ser Ile Pro Leu Phe Leu 
210 215 220 

CTC TTC AGT CCG GTG ATA GTC CCG GCG GCT CTC ACT ATT GGG CTT GCT 342 
Leu Phe Ser Pro Val Ile Val Pro Ala Ala Leu Thr Ile Gly Leu Ala 
225 230 235 

GTG ACG GGA ATC TTG GCT TCT GGT TTG TTT GGG TTG ACG GGT CTG AGC 390 
Val Thr Gly Ile Leu Ala Ser Gly Leu Phe Gly Leu Thr Gly Leu Ser 
240 245 250 255 

TCG GTC TCG TGG GTC CTC AAC TAC CTC CGT GGG ACG AGT GAT ACA GTG 438 
Ser Val Ser Trp Val Leu Asn Tyr Leu Arg Gly Thr Ser Asp Thr Val 
260 265 270 

CCA GAG CAA TTG GAC TAC GCT AAA CGG CGT ATG GCT GAT GCG GTA GGC 486 
Pro Glu Gln Leu Asp Tyr Ala Lys Arg Arg Met Ala Asp Ala Val Gly 
275 280 285 

TAT GCT GGT ATG AAG GGA AAA GAG ATG GGT CAG TAT GTG CAA GAT AAG 534 
Tyr Ala Gly Met Lys Gly Lys Glu Met Gly Gln Tyr Val Gln Asp Lys 
290 295 300 

GCT CAT GAG GCT CGT GAG ACT GAG TTC ATG ACT GAG ACC CAT GAG CCG 582 
Ala His Glu Ala Arg Glu Thr Glu Phe Met Thr Glu Thr His Glu Pro 
305 310 315 

GGT AAG GCC AGG AGA GGC TCA TAAGCTAATA TAAATTGCGG GAGTCAGTTG 633 
Gly Lys Ala Arg Arg Gly Ser 
320 325 

GAAACGCGAT AAATGTAGTT TTACTTTTAT GTCCCAGTTT CTTTCCTCTT TTAAGAATAT 693 

CTTTGTCTAT ATATGTGTTC GTTCGTTTTG TCTTGTCCAA ATAAAAATCC TTGTTAGTGA 753 

AATAAGAAAT GAAATAAATA TGTTTTCTTT TTTGAGATAA CCAGAAATCT CATACTATTT 813 

TCTAAAAAAA AAAAAAAAAA A 834 

(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
1 5 10 15 

Lys Arg Val His Gln Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 30 

Tyr Gly Gly Tyr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 45 

Thr Asn Gln Ile Leu Ala Leu Ile Ala Gly Val Pro Ile Gly Gly Thr 
50 55 60 

Leu Leu Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val Ile Gly Leu 
65 70 75 80 

Leu Val Ser Ile Pro Leu Phe Leu Leu Phe Ser Pro Val Ile Val Pro 
85 90 95 

Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly 
100 105 110 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 
115 120 125 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys 
130 135 140 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 155 160 

Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu 
165 170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
180 185 190 

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
1 5 10 15 

Lys Arg Val His Gln Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 30 

Thr Gly Gly Thr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 45 

Thr Asn Gln Ile Leu Ala Leu Ile Ala Gly Val Pro Ile Gly Gly Thr 
50 55 60 

Leu Ile Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val Ile Gly Leu 
65 70 75 80 

Leu Val Ser Ile Pro Leu Phe Leu Ile Phe Ser Pro Val Ile Val Pro 
85 90 95 

Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly 
100 105 110 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 
115 120 125 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys 
130 135 140 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 155 160 

Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu 
165 170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
180 185 190 

(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 : 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
1 5 10 15 

Lys Arg Val His Gln Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 30 

Thr Gly Gly Thr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 45 

Thr Asn Gln Ile Leu Ala Leu Ile Ala Gly Val Pro Ile Gly Gly Thr 
50 55 60 

Leu Ile Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val Ile Gly Leu 
65 70 75 80 

Leu Val Ser Ile Pro Leu Phe Leu Ile Phe Ser Pro Val Ile Val Pro 
85 90 95 

Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly 
100 105 110 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 
115 120 125 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys 
130 135 140 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 155 160 

Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu 
165 170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Pro 
180 185 190 

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Gln Leu Pro 
1 5 10 15 

Pro Trp Ala Ser Asp Thr Val Pro Glu Gln Val Asp Tyr Ala Lys Arg 
20 25 30 

Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu Met 
35 40 45 

Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu Phe 
50 55 60 

Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
65 70 75 

(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 

Met Ala Asp Thr Ala Arg Gly Thr His His Asp Ile Ile Gly Arg Asp 
1 5 10 15 

Gln Tyr Pro Met Met Gly Arg Asp Arg Asp Gln Tyr Gln Met Ser Gly 
20 25 30 

Arg Gly Ser Asp Tyr Ser Lys Ser Arg Gln Ile Ala Lys Ala Ala Thr 
35 40 45 

Ala Val Thr Ala Gly Gly Ser Leu Leu Val Leu Ser Ser Leu Thr Leu 
50 55 60 

Val Gly Thr Val Leu Ala Leu Thr Val Ala Thr Pro Leu Leu Val Leu 
65 70 75 80 

Phe Ser Pro Ile Leu Val Pro Ala Leu Ile Thr Val Ala Leu Leu Ile 
85 90 95 

Thr Gly Phe Leu Ser Ser Gly Gly Phe Gly Ile Ala Ala Ile Thr Val 
100 105 110 

Phe Ser Trp Ile Tyr Lys Tyr Ala Thr Gly Glu His Pro Gln Gly Ser 
115 120 125 

Asp Lys Leu Asp Ser Ala Arg Met Lys Leu Gly Ser Lys Ala Gln Asp 
130 135 140 

Leu Lys Asp Arg Ala Gln Tyr Tyr Gly Gln Gln His Thr Gly Gly Glu 
145 150 155 160 

His Asp Arg Asp Arg Thr Arg Gly Gly Gln His Thr Thr 
165 170 

(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 141 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9 : 

Met Ala Asp Gln Thr Arg Thr His His Glu Met Ile Ser Arg Asp Ser 
1 5 10 15 

Thr Gln Glu Ala His Pro Lys Ala Arg Gln Trp Val Lys Ala Ala Thr 
20 25 30 

Ala Val Thr Ala Gly Gly Ser Leu Leu Val Leu Ser Gln Leu Thr Leu 
35 40 45 

Ala Gly Thr Val Ile Ala Leu Thr Val Ala Thr Pro Leu Leu Val Ile 
50 55 60 

Phe Ser Pro Val Leu Val Pro Ala Val Val Thr Val Ala Leu Ile Ile 
65 70 75 80 

Thr Gly Phe Leu Ala Ser Gly Gly Phe Gly Ile Ala Ala Ile Thr Ala 
85 90 95 

Phe Ser Trp Leu Tyr Arg His Trp Thr Gly Ser Gly Ser Asp Lys Ile 
100 105 110 

Glu Trp Ala Arg Met Lys Val Gly Ser Arg Val Gln Asp Thr Lys Tyr 
115 120 125 

Gly Gln His Trp Ile Gly Val Gln His Gln Gln Val Ser 
130 135 140 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Ala Asp Thr His Arg Val Asp Arg Thr Asp Arg His Phe Gln Phe 
1 5 10 15 

Gln Ser Pro Tyr Glu Gly Gly Arg Gly Gln Gly Gln Tyr Glu Gly Asp 
20 25 30 

Arg Gly Tyr Gly Gly Gly Gly Tyr Lys Ser Met Met Pro Glu Ser Gly 
35 40 45 

Pro Ser Ser Thr Gln Val Leu Ser Leu Leu Ile Gly Val Pro Val Val 
50 55 60 

Gly Ser Leu Ile Ala Leu Ala Gly Leu Leu Leu Ala Gly Ser Val Ile 
65 70 75 80 

Gly Leu Met Val Ala Leu Pro Leu Phe Leu Ile Phe Ser Pro Val Ile 
85 90 95 

Val Pro Ala Gly Leu Thr Ile Gly Leu Ala Met Thr Gly Phe Leu Ala 
100 105 110 

Ser Gly Met Phe Gly Leu Thr Gly Leu Ser Ser Ile Ser Trp Val Met 
115 120 125 

Asn Tyr Leu Arg Gly Thr Ala Arg Thr Val Pro Glu Gln Leu Glu Tyr 
130 135 140 

Ala Lys Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Gln Lys Gly 
145 150 155 160 

Lys Glu Met Gly Gln His Val Gln Asn Lys Ala Gln Asp Val Lys Gln 
165 170 175 

Tyr Asp Ile Ser Lys Pro His Asp Thr Thr Thr Lys Gly His Glu Thr 
180 185 190 

Gln Gly Gly Thr Thr Ala Ala 
195 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Ala Asp Thr His Arg Val Asp Arg Thr Asp Arg His Phe Gln Phe 
1 5 10 15 

Gln Ser Pro Tyr Glu Gly Gly Arg Gly Gln Gly Gln Tyr Glu Gly Asp 
20 25 30 

Arg Gly Tyr Gly Gly Gly Gly Tyr Lys Ser Met Met Pro Glu Ser Gly 
35 40 45 

Pro Ser Ser Thr Gln Val Leu Ser Leu Leu Ile Gly Val Pro Val Val 
50 55 60 

Gly Ser Leu Ile Ala Leu Ala Gly Leu Leu Ile Ala Gly Ser Val Ile 
65 70 75 80 

Gly Leu Met Val Ala Leu Pro Leu Phe Leu Ile Phe Ser Pro Val Ile 
85 90 95 

Val Pro Ala Ala Leu Thr Ile Gly Leu Ala Met Thr Gly Phe Leu Ala 
100 105 110 

Ser Gly Met Phe Gly Leu Thr Gly Leu Ser Ser Ile Ser Trp Val Met 
115 120 125 

Asn Tyr Leu Arg Gly Thr Arg Arg Thr Val Pro Glu Gln Leu Glu Tyr 
130 135 140 

Ala Lys Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Gln Lys Gly 
145 150 155 160 

Lys Glu Met Gly Gln His Val Gln Asn Lys Ala Gln Asp Val Lys Gln 
165 170 175 

Tyr Asp Ile Ser Lys Pro His Asp Thr Thr Thr Lys Gly His Glu Thr 
180 185 190 

Gln Gly Arg Thr Thr Ala Ala 
195 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1267 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



GAGCTCGATC ACACAAAGAA AACGTCAAAT GGATCATACT GGGCCCATTT TGCAGACCAA 60 

GAGAAAGTGA GAGAGAGTTG TCCTCTCGTT ATCAAGTAAC AGTAGACCAC CACTAAACCG 120 

CCAATAGCTT ATAATCAAAA TAGAAAGGTC TAATAACAGA AACAAATGAA AAAGCCTTGT 180 

TCCATGGACT GCCTACCCGA ATTGATTGAT TCGACTAGTT TTTCTTCTTC TTTGATTAAG 240 

ACCTCCGTAA GAAAAATGGT ACTACTAAAG CCACTCGCTA CCAAAACTAA ACCATTCCAG 300 

ACTGTAACTG GACCAATATT TCTAAACTGT AACCAGATCT CAAACATATA AACTAATTAA 360 

GAACTATAAC CATTAACCGT AAAAATAAAT TTACTACAGT AAAAAATTAT ACTAATTTCA 420 

GCTATGATGG AATTTCAGCT CTTAAGAGTT GTGGAAATCA AGTAAACCTA AAATCCTAAT 480 

AATATTCTTC ATCCTTATTT TTGTTTCACA TGCATGCTGT CCAATCTGTT ATTAGCATTT 540 

GAAAGCCTAA AATTCTATAT ACAGTACAAT AAATCTAATT AATTTTCATT ACTAATAAAA 600 

TGCTTCATAT ATACTCTTGT ATTTATAAAT CATCCGTTAT CGTTACTATA CCTTTATACA 660 

TCATCCTACA TTCATACCTA AGCTAGCAAA GCAAACTACT AAAAGGGTCG TCAACGCAAG 720 

TTATTTGCTA GTTGGTGCAT ACTACACACG GCTACGGCAA CATTAAGTAA CACATTAAGA 780 

GGTGTTTTCT TAATGTAGTA TGGTAATTAT ATTTATTTCA AAACTTGGAT TAGATATAAA 840 

GGTACAGGTA GATGAAAAAT ATTTGGTTAG CGGGTTGAGA TTAAGCGGAT ATAGGAGGCA 900 

TATATACAGC TGTGAGAAGA AGAGGGATAA ATACAAAAAG GGAAGGATGT TTTTGCCGAC 960 

AGAGAAAGGT AGATTAAGTA GGCATCGAGA GGAGAGCAAT TGTAAAATGG ATGATTTGTT 1020 

TGGTTTTGTA CGGTGGAGAG AAGAACGAAA AGATGATCAG GTAAAAAATG AAACTTGGAA 1080 

ATCATGCAAA GCCACACCTC TCCCTTCAAC ACAGTCTTAC GTGTCGTCTT CTCTTCACTC 1140 

CATATCTCCT TTTTATTACC AAGAAATATA TGTCAATCCC ATTTATATGT ACGTTCTCTT 1200 

AGACTTATCT CTATATACCC CCTTTTAATT TGTGTGCTCT TAGCCTTTAC TCTATAGTTT 1260 

TAGATAG 1267 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GTAATACGAC TCACTATAGG GC 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGGGATCCTA TACTAAAACT ATAGAGTAAA GG 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Trp Ile Gly His Asp Ala Gly His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Asn Val Gly His Asp Ala Asn His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Val Leu Gly His Asp Cys Gly His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Val Ile Ala His Glu Cys Gly His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Val Ile Gly His Asp Cys Ala His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Val Val Gly His Asp Cys Gly His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

His Asn Ala His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

His Asn Tyr Leu His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

His Arg Thr His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

His Arg Arg His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

His Asp Arg His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

His Asp Gln His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

His Asp His His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

His Asn His His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Phe Gln Ile Glu His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

His Gln Val Thr His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

His Val Ile His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

His Val Ala His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

His Ile Pro His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

His Val Pro His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1941 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GAGCTCGATC ACACAAAGAA AACGTCAAAT GGATCATACT GGGCCCATTT TGCAGACCAA 60 

GAGAAAGTGA GAGAGAGTTG TCCTCTCGTT ATCAAGTAAC AGTAGACCAC CACTAAACCG 120 

CCAATAGCTT ATAATCAAAA TAGAAAGGTC TAATAACAGA AACAAATGAA AAAGCCTTGT 180 

TCCATGGACT GCCTACCCGA ATTGATTGAT TCGACTAGTT TTTCTTCTTC TTTGATTAAG 240 

ACCTCCGTAA GAAAAATGGT ACTACTAAAG CCACTCGCTA CCAAAACTAA ACCATTCCAG 300 

ACTGTAACTG GACCAATATT TCTAAACTGT AACCAGATCT CAAACATATA AACTAATTAA 360 

GAACTATAAC CATTAACCGT AAAAATAAAT TTAGTACAGT AAAAAATTAT ACTAATTTCA 420 

GCTATGATGG AATTTCAGCT CTTAAGAGTT GTGGAAATCA AGTAAACCTA AAATCCTAAT 480 

AATATTCTTC ATCCTTATTT TTGTTTCACA TGCATGCTGT CCAATCTGTT ATTAGCATTT 540 

GAAAGCCTAA AATTCTATAT ACAGTACAAT AAATCTAATT AATTTTCATT ACTAATAAAA 600 

TGCTTCATAT ATACTCTTGT ATTTATAAAT CATCCGTTAT CGTTACTATA CCTTTATACA 660 

TCATCCTACA TTCATACCTA AGCTAGCAAA GCAAACTACT AAAAGGGTCG TCAACGCAAG 720 

TTATTTGCTA GTTGGTGCAT ACTACACACG GCTACGGCAA CATTAAGTAA CACATTAAGA 780 

GGTGTTTTCT TAATGTAGTA TGGTAATTAT ATTTATTTCA AAACTTGGAT TAGATATAAA 840 

GGTACAGGTA GATGAAAAAT ATTTGGTTAG CGGGTTGAGA TTAAGCGGAT ATAGGAGGCA 900 

TATATACAGC TGTGAGAAGA AGAGGGATAA ATACAAAAAG GGAAGGATGT TTTTGCCGAC 960 

AGAGAAAGGT AGATTAAGTA GGCATCGAGA GGAGAGCAAT TGTAAAATGG ATGATTTGTT 1020 

TGGTTTTGTA CGGTGGAGAG AAGAACGAAA AGATGATCAG GTAAAAAATG AAACTTGGAA 1080 

ATCATGCAAA GCCACACCTC TCCCTTCAAC ACAGTCTTAC GTGTCGTCTT CTCTTCACTC 1140 

CATATCTCCT TTTTATTACC AAGAAATATA TGTCAATCCC ATTTATATGT ACGTTCTCTT 1200 

AGACTTATCT CTATATACCC CCTTTTAATT TGTGTGCTCT TAGCCTTTAC TCTATAGTTT 1260 

TAGATAGACA TGGCGAATGT GGATCGTGAT CGGCGTGTGC ATGTAGACCG TACTGACAAA 1320 

CGTGTTCATC AGCCAAACTA CGAAGATGAT GTCGGTTTTG GTGGCTATGG CGGTTATGGT 1380 

GCTGGTTCTG ATTATAAGAG TCGCGGCCCC TCCACTAACC AAGTATTTTT GTGGTCTCTT 1440 

TAGTTTTTCT TGTGTTTTCC TATGATCACG CTCTCCAAAC TATTTGAAGA TTTTCTGTAA 1500 

ATTCATTTTA AACAGAAAGA TAAATAAAAT AGTGAAGAAC CATAGGAATC GTACGTTACG 1560 

TTAATTATTT CCTTTTAGTT CTTAAGTCCT AATTAGGATT CCTTTAAAAG TTGCAACAAT 1620 

CTAATTGTTC ACAAAATGAG TAAAGTTTGA AACAGATTTT TATACACCAC TTGCATATGT 1680 

TTATCATGGT GATGCATGCT TGTTAGATAA ACTCGATATA ATCAATACAT GCAGATCTTG 1740 

GCACTTATAG CAGGAGTCCA TTGGTGGCAC ACTGCTAACC CTAGCTGGAC TCACTCTAGC 1800 

CGGTTCGGTG ATCGGCTTGC TAGTCTCCAT ACCCCTCTTC CTCCTCTTCA GTCCGGTGAT 1860 

AGTCCCGGCG GCTCTCACTA TTGGGCTTGC TGTGACGGGA ATCTTGGCTT CTGGTTTGTT 1920 

TGGGTTGACG GGTCTGAGCT C 1941 

What is claimed is: 

1. An isolated nucleic acid encoding an 
oleosin 5' regulatory region which directs 
seed-specific expression selected from the group 
consisting of the nucleotide sequence set forth in SEQ 
ID NO: 12, the nucleotide sequence set forth in SEQ ID 
NO: 12 having an insertion, deletion, or substitution 
of one or more nucleotides, or a contiguous fragment 
of the nucleotide sequence set forth in SEQ ID NO: 12. 

2. An expression cassette which comprises 
the oleosin 5' regulatory region of Claim 1 operably 
linked to at least one of a nucleic acid encoding a 
heterologous gene or a nucleic acid encoding a 
sequence complementary to a native plant gene. 

3. The expression cassette of Claim 2 
wherein the heterologous gene is at least one of a 
fatty acid synthesis gene or a lipid metabolism gene. 

4. The expression cassette of Claim 3 
wherein the heterologous gene is selected from the 
group consisting of an acetyl-CoA carboxylase gene, a 
ketoacyl synthase gene, a malonyl transacylase gene, a 
lipid desaturase gene, an acyl carrier protein (ACP) 
gene, a thioesterase gene, an acetyl transacylase 
gene, or an elongase gene. 

5. The expression cassette of Claim 4 
wherein the lipid desaturase gene is selected from the 
group consisting of a Δ6-desaturase gene, a Δ12-desaturase 
gene, and a Δ15-desaturase gene. 

6. An expression vector which comprises the 
expression cassette of any one of Claims 2-5. 

7. A cell comprising the expression 
cassette of any one of Claims 2-5. 

8. A cell comprising the expression vector 
of Claim 6. 

9. The cell of Claim 7 wherein said cell is 
a bacterial cell or a plant cell. 

10. The cell of Claim 8 wherein said cell 
is a bacterial cell or a plant cell. 

11. A transgenic plant comprising the 
expression cassette of any one of Claims 2-5. 

12. A transgenic plant comprising the 
expression vector of Claim 6. 

13. A plant which has been regenerated from 
the plant cell of Claim 9. 

14. A plant which has been regenerated from 
the plant cell of Claim 10. 

15. The plant of Claim 12 or 13 wherein 
said plant is at least one of a sunflower, soybean, 
maize, cotton, tobacco, peanut, oil seed rape or 
Arabidopsis plant. 

16. Progeny of the plant of Claim 11 or 12. 

17. Seed from the plant of Claim 11 or 12. 

18. A method of producing a plant with 
increased levels of a product of a fatty acid 
synthesis gene or a lipid metabolism gene which 
comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to at least one of an 
isolated nucleic acid coding for a fatty acid 
synthesis gene or a lipid metabolism gene; and 

(b) regenerating a plant with increased 
levels of the product of said fatty acid synthesis or 
said lipid metabolism gene from said plant cell. 

19. A method of producing a plant with 
increased levels of gamma linolenic acid (GLA) content 
which comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a Δ6-desaturase gene; 
and 

(b) regenerating a plant with increased 
levels of GLA from said plant cell. 

20. The method of Claim 19 wherein said Δ6-desaturase 
gene is at least one of a cyanobacterial 
Δ6-desaturase gene or a Borage Δ6-desaturase gene. 

21. The method of any one of Claims 18-20 
wherein said plant is a sunflower, soybean, maize, 
tobacco, cotton, peanut, oil seed rape or Arabidopsis 
plant. 

22. The method of Claim 18 wherein said 
fatty acid synthesis gene or said lipid metabolism 
gene is at least one of a lipid desaturase, an acyl 
carrier protein (ACP) gene, a thioesterase gene, an 
elongase gene, an acetyl transacylase gene, an acetyl-CoA 
carboxylase gene, a ketoacyl synthase gene, or a 
malonyl transacylase gene. 

23. A method of inducing production of at 
least one of gamma linolenic acid (GLA) or 
octadecatetraenoic acid (OTA) in a plant deficient or 
lacking in GLA which comprises transforming said plant 
with an expression vector comprising the isolated 
nucleic acid of Claim 1 operably linked to a Δ6-desaturase 
gene and regenerating a plant with 
increased levels of at least one of GLA or OTA. 

24. A method of decreasing production of a 
fatty acid synthesis or lipid metabolism gene in a 
plant which comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a nucleic acid sequence 
complementary to a fatty acid synthesis or lipid 
metabolism gene; and 

(b) regenerating a plant with decreased 
production of said fatty acid synthesis or said lipid 
metabolism gene. 

25. A method of cosuppressing a native 
fatty acid synthesis or lipid metabolism gene in a 
plant which comprises: 

(a) transforming a cell of the plant with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a nucleic acid sequence 
encoding a fatty acid synthesis or lipid metabolism 
gene native to the plant; and 

(b) regenerating a plant with decreased 
production of said fatty acid synthesis or said lipid 
metabolism gene. 